2011-02-03 03:41:58

by Nick Bowler

[permalink] [raw]
Subject: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).

With 2.6.38-rc3+, I'm receiving nasty WARNINGS and lockdep spews on my
laptop. Other than the noise, things seem to work fine. The lockdep
warning is spewed as soon as the second of the following two events
occurs (in any order):

* a key on the keyboard is pressed.
* the radio switch is in the off position.

For example, pressing a key after booting with the switch in the off
position will trigger a dump. The following output was what I got when I
booted with the switch on, flipped the switch off, flipped the switch
back on, then pressed a key (note that the iwlagn errors are normal).
Bisection results follow the traces -- reverting the implicated commit
solves the issue.

------------[ cut here ]------------
WARNING: at /home/nick/misc/linux-2.6/kernel/softirq.c:159 _local_bh_enable_ip+0x49/0xe3()
Hardware name: 2242CTO
Modules linked in: nfs nfs_acl aes_x86_64 aes_generic nfsd lockd sunrpc exportfs ipv6 iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter xt_mark ip_tables snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc ufs ext2 vhost_net tun arc4 ecb crypto_blkcipher kvm_intel cryptomgr aead crypto_algapi kvm iwlagn iwlcore mac80211 btusb bluetooth cfg80211 e1000e sg joydev sr_mod cdrom uhci_hcd mmc_block ehci_hcd usbcore nls_base evdev thinkpad_acpi hwmon rfkill nvram acpi_cpufreq mperf sdhci_pci sdhci pcmcia mmc_core yenta_socket pcmcia_rsrc pcmcia_core psmouse loop rtc_cmos rtc_core rtc_lib
Pid: 10, comm: kworker/0:1 Not tainted 2.6.38-rc3-bisect-00062-g8c007a1 #92
Call Trace:
[<ffffffff8103f57b>] ? warn_slowpath_common+0x80/0x98
[<ffffffff8104a418>] ? del_timer_sync+0x4f/0x8e
[<ffffffff8103f5a8>] ? warn_slowpath_null+0x15/0x17
[<ffffffff81044c88>] ? _local_bh_enable_ip+0x49/0xe3
[<ffffffff81044d3a>] ? local_bh_enable+0xd/0xf
[<ffffffff8104a418>] ? del_timer_sync+0x4f/0x8e
[<ffffffff8104a3c9>] ? del_timer_sync+0x0/0x8e
[<ffffffffa00947fe>] ? rfkill_schedule_global_op+0x41/0x84 [rfkill]
[<ffffffffa0094858>] ? rfkill_schedule_evsw_rfkillall+0x17/0x19 [rfkill]
[<ffffffffa009495f>] ? rfkill_event+0x58/0x5a [rfkill]
[<ffffffff8125c980>] ? input_pass_event+0x97/0xea
[<ffffffff8125de13>] ? input_handle_event+0x460/0x46f
[<ffffffff8125df1b>] ? input_event+0x66/0x83
[<ffffffffa009becb>] ? tpacpi_send_radiosw_update+0x7f/0xc4 [thinkpad_acpi]
[<ffffffffa009c417>] ? hotkey_notify+0x507/0x581 [thinkpad_acpi]
[<ffffffff810bc242>] ? kfree+0xd2/0xeb
[<ffffffffa009b029>] ? dispatch_acpi_notify+0x29/0x2b [thinkpad_acpi]
[<ffffffff8119a18a>] ? acpi_ev_notify_dispatch+0x62/0x7c
[<ffffffff8118b323>] ? acpi_os_execute_deferred+0x24/0x31
[<ffffffff810511bf>] ? process_one_work+0x204/0x33b
[<ffffffff8105115e>] ? process_one_work+0x1a3/0x33b
[<ffffffff8118b2ff>] ? acpi_os_execute_deferred+0x0/0x31
[<ffffffff81053384>] ? worker_thread+0x136/0x255
[<ffffffff8105324e>] ? worker_thread+0x0/0x255
[<ffffffff810566da>] ? kthread+0x7d/0x85
[<ffffffff81003864>] ? kernel_thread_helper+0x4/0x10
[<ffffffff812ffed8>] ? restore_args+0x0/0x30
[<ffffffff8105665d>] ? kthread+0x0/0x85
[<ffffffff81003860>] ? kernel_thread_helper+0x0/0x10
---[ end trace aae23672f55877aa ]---
iwlagn 0000:03:00.0: Not sending command - RF KILL
iwlagn 0000:03:00.0: Error sending REPLY_REMOVE_STA: enqueue_hcmd failed: -5
iwlagn 0000:03:00.0: Error removing station 00:1e:e5:7a:b2:6f
iwlagn 0000:03:00.0: Not sending command - RF KILL
iwlagn 0000:03:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
iwlagn 0000:03:00.0: Error clearing ASSOC_MSK on BSS (-5)
iwlagn 0000:03:00.0: Not sending command - RF KILL
iwlagn 0000:03:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
iwlagn 0000:03:00.0: Error clearing ASSOC_MSK on BSS (-5)
cfg80211: Calling CRDA to update world regulatory domain
iwlagn 0000:03:00.0: Not sending command - RF KILL
iwlagn 0000:03:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
iwlagn 0000:03:00.0: Error clearing ASSOC_MSK on BSS (-5)
usb 4-2: USB disconnect, address 2
iwlagn 0000:03:00.0: RF_KILL bit toggled to enable radio.
usb 4-2: new full speed USB device using uhci_hcd and address 3

=================================
[ INFO: inconsistent lock state ]
2.6.38-rc3-bisect-00062-g8c007a1 #92
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
(&(&dev->event_lock)->rlock){?.+...}, at: [<ffffffff8125def6>] input_event+0x41/0x83
{HARDIRQ-ON-W} state was registered at:
[<ffffffff810656d4>] mark_held_locks+0x52/0x70
[<ffffffff8106593a>] trace_hardirqs_on_caller+0xf7/0x135
[<ffffffff81065985>] trace_hardirqs_on+0xd/0xf
[<ffffffff81044ac2>] __do_softirq+0x62/0x160
[<ffffffff8100395c>] call_softirq+0x1c/0x28
[<ffffffff81044cf2>] _local_bh_enable_ip+0xb3/0xe3
[<ffffffff81044d3a>] local_bh_enable+0xd/0xf
[<ffffffff8104a418>] del_timer_sync+0x4f/0x8e
[<ffffffffa00947fe>] rfkill_schedule_global_op+0x41/0x84 [rfkill]
[<ffffffffa0094858>] rfkill_schedule_evsw_rfkillall+0x17/0x19 [rfkill]
[<ffffffffa009495f>] rfkill_event+0x58/0x5a [rfkill]
[<ffffffff8125c980>] input_pass_event+0x97/0xea
[<ffffffff8125de13>] input_handle_event+0x460/0x46f
[<ffffffff8125df1b>] input_event+0x66/0x83
[<ffffffffa009becb>] tpacpi_send_radiosw_update+0x7f/0xc4 [thinkpad_acpi]
[<ffffffffa009c417>] hotkey_notify+0x507/0x581 [thinkpad_acpi]
[<ffffffffa009b029>] dispatch_acpi_notify+0x29/0x2b [thinkpad_acpi]
[<ffffffff8119a18a>] acpi_ev_notify_dispatch+0x62/0x7c
[<ffffffff8118b323>] acpi_os_execute_deferred+0x24/0x31
[<ffffffff810511bf>] process_one_work+0x204/0x33b
[<ffffffff81053384>] worker_thread+0x136/0x255
[<ffffffff810566da>] kthread+0x7d/0x85
[<ffffffff81003864>] kernel_thread_helper+0x4/0x10
irq event stamp: 559266
hardirqs last enabled at (559263): [<ffffffff811b40a5>] acpi_idle_enter_bm+0x236/0x275
hardirqs last disabled at (559264): [<ffffffff812ffd6a>] save_args+0x6a/0x70
softirqs last enabled at (559266): [<ffffffff81044bce>] _local_bh_enable+0xe/0x10
softirqs last disabled at (559265): [<ffffffff81044c0d>] irq_enter+0x3d/0x6f

other info that might help us debug this:
1 lock held by swapper/0:
#0: (&serio->lock){-.-...}, at: [<ffffffff81259018>] serio_interrupt+0x24/0x7e

stack backtrace:
Pid: 0, comm: swapper Tainted: G W 2.6.38-rc3-bisect-00062-g8c007a1 #92
Call Trace:
<IRQ> [<ffffffff81065442>] ? valid_state+0x17e/0x191
[<ffffffff8100e159>] ? save_stack_trace+0x28/0x45
[<ffffffff81065d18>] ? check_usage_forwards+0x0/0x87
[<ffffffff81065568>] ? mark_lock+0x113/0x22d
[<ffffffff8106687f>] ? __lock_acquire+0x2ac/0xd01
[<ffffffff81065482>] ? mark_lock+0x2d/0x22d
[<ffffffff81065482>] ? mark_lock+0x2d/0x22d
[<ffffffff81065482>] ? mark_lock+0x2d/0x22d
[<ffffffff81067338>] ? lock_acquire+0x64/0x81
[<ffffffff8125def6>] ? input_event+0x41/0x83
[<ffffffff812ff378>] ? _raw_spin_lock_irqsave+0x53/0x65
[<ffffffff8125def6>] ? input_event+0x41/0x83
[<ffffffff8125def6>] ? input_event+0x41/0x83
[<ffffffff81261fb6>] ? atkbd_interrupt+0x9a/0x574
[<ffffffff81259034>] ? serio_interrupt+0x40/0x7e
[<ffffffff8125a31c>] ? i8042_interrupt+0x28e/0x2a8
[<ffffffff8107e834>] ? handle_IRQ_event+0x20/0xae
[<ffffffff810805e6>] ? handle_edge_irq+0xde/0x12a
[<ffffffff810055c9>] ? handle_irq+0x1f/0x2a
[<ffffffff81004c37>] ? do_IRQ+0x48/0xaf
[<ffffffff812ffe13>] ? ret_from_intr+0x0/0x1a
<EOI> [<ffffffff8100a1aa>] ? native_sched_clock+0x2d/0x5f
[<ffffffff811b40ac>] ? acpi_idle_enter_bm+0x23d/0x275
[<ffffffff811b40a5>] ? acpi_idle_enter_bm+0x236/0x275
[<ffffffff8126c2ae>] ? cpuidle_idle_call+0x9f/0xd5
[<ffffffff81001e75>] ? cpu_idle+0x61/0xaa
[<ffffffff812ecdc9>] ? rest_init+0xcd/0xd4
[<ffffffff812eccfc>] ? rest_init+0x0/0xd4
[<ffffffff81682d3d>] ? start_kernel+0x3a6/0x3b1
[<ffffffff816822b1>] ? x86_64_start_reservations+0xb8/0xbc
[<ffffffff816823bb>] ? x86_64_start_kernel+0x106/0x115

1118e2cd33d47254854e1ba3ba8e32802ff14fdf is the first bad commit
commit 1118e2cd33d47254854e1ba3ba8e32802ff14fdf
Author: Yong Zhang <[email protected]>
Date: Wed Oct 20 15:57:32 2010 -0700

timer: Del_timer_sync() can be used in softirq context

Actually we have used del_timer_sync() in softirq context for a long time,
e.g. in __dst_free()::cancel_delayed_work().

So change the comments of it to warn on hardirq context only, and make
lockdep know about this change.

Signed-off-by: Yong Zhang <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>

:040000 040000 864a7fbdaeb5226c2cc47a49e56c2d5509d548be adb06c69b689b1ec1a9853e9d4f22b631683c164 M kernel

git bisect start
# bad: [afe8a887550f7cc24eb16449670b93f6b43c32d8] Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
git bisect bad afe8a887550f7cc24eb16449670b93f6b43c32d8
# good: [3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5] Linux 2.6.37
git bisect good 3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5
# bad: [949f6711b83d2809d1ccb9d830155a65fdacdff9] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
git bisect bad 949f6711b83d2809d1ccb9d830155a65fdacdff9
# bad: [3c0cb7c31c206aaedb967e44b98442bbeb17a6c4] Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm
git bisect bad 3c0cb7c31c206aaedb967e44b98442bbeb17a6c4
# good: [63e35cd9bd4c8ae085c8b9a70554595b529c4100] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
git bisect good 63e35cd9bd4c8ae085c8b9a70554595b529c4100
# bad: [9858a38ea3a940762ae3028cce88f686d0e0c28b] Merge branch 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
git bisect bad 9858a38ea3a940762ae3028cce88f686d0e0c28b
# bad: [6f46b120a96212b85cbdcb84a64c854dfd791ede] Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 6f46b120a96212b85cbdcb84a64c854dfd791ede
# good: [4eed11d5e24540dc133003b6e8f904cb747ac4bb] perf evsel: Auto allocate resources needed for some methods
git bisect good 4eed11d5e24540dc133003b6e8f904cb747ac4bb
# good: [28d9bfc37c861aa9c8386dff1ac7e9a10e5c5162] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good 28d9bfc37c861aa9c8386dff1ac7e9a10e5c5162
# good: [65b2074f84be2287e020839e93b4cdaaf60eb37c] Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good 65b2074f84be2287e020839e93b4cdaaf60eb37c
# bad: [42cbd8efb0746b55112de45173219f76c54390da] Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 42cbd8efb0746b55112de45173219f76c54390da
# bad: [87de5ac782761a3ebf806e434e8c9cc205a87274] timers: Introduce timerlist infrastructure.
git bisect bad 87de5ac782761a3ebf806e434e8c9cc205a87274
# good: [dd6414b50fa2b1cd247a8aa8f8bd42414b7453e1] timer: Permit statically-declared work with deferrable timers
git bisect good dd6414b50fa2b1cd247a8aa8f8bd42414b7453e1
# bad: [1118e2cd33d47254854e1ba3ba8e32802ff14fdf] timer: Del_timer_sync() can be used in softirq context
git bisect bad 1118e2cd33d47254854e1ba3ba8e32802ff14fdf
# good: [6f1bc451e6a79470b122a37ee1fc6bbca450f444] timer: Make try_to_del_timer_sync() the same on SMP and UP
git bisect good 6f1bc451e6a79470b122a37ee1fc6bbca450f444

--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)


2011-02-03 09:12:40

by Yong Zhang

[permalink] [raw]
Subject: Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).

Hi Nick,

Thanks for reporting this.
Could you please try below patch?

Thanks,
Yong

---
From: Yong Zhang <[email protected]>
Subject: [PATCH] timer: use _local_bh_enable() in del_timer_sync()

Nick Bowler reported:
=================================
[ INFO: inconsistent lock state ]
2.6.38-rc3-bisect-00062-g8c007a1 #92
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
(&(&dev->event_lock)->rlock){?.+...}, at: [<ffffffff8125def6>] input_event+0x41/0x83
{HARDIRQ-ON-W} state was registered at:
[<ffffffff810656d4>] mark_held_locks+0x52/0x70
[<ffffffff8106593a>] trace_hardirqs_on_caller+0xf7/0x135
[<ffffffff81065985>] trace_hardirqs_on+0xd/0xf
[<ffffffff81044ac2>] __do_softirq+0x62/0x160
[<ffffffff8100395c>] call_softirq+0x1c/0x28
[<ffffffff81044cf2>] _local_bh_enable_ip+0xb3/0xe3
[<ffffffff81044d3a>] local_bh_enable+0xd/0xf
[<ffffffff8104a418>] del_timer_sync+0x4f/0x8e
[<ffffffffa00947fe>] rfkill_schedule_global_op+0x41/0x84 [rfkill]
[<ffffffffa0094858>] rfkill_schedule_evsw_rfkillall+0x17/0x19 [rfkill]
[<ffffffffa009495f>] rfkill_event+0x58/0x5a [rfkill]
[<ffffffff8125c980>] input_pass_event+0x97/0xea
[<ffffffff8125de13>] input_handle_event+0x460/0x46f
[<ffffffff8125df1b>] input_event+0x66/0x83
[<ffffffffa009becb>] tpacpi_send_radiosw_update+0x7f/0xc4 [thinkpad_acpi]
[<ffffffffa009c417>] hotkey_notify+0x507/0x581 [thinkpad_acpi]
[<ffffffffa009b029>] dispatch_acpi_notify+0x29/0x2b [thinkpad_acpi]
[<ffffffff8119a18a>] acpi_ev_notify_dispatch+0x62/0x7c
[<ffffffff8118b323>] acpi_os_execute_deferred+0x24/0x31
[<ffffffff810511bf>] process_one_work+0x204/0x33b
[<ffffffff81053384>] worker_thread+0x136/0x255
[<ffffffff810566da>] kthread+0x7d/0x85
[<ffffffff81003864>] kernel_thread_helper+0x4/0x10
irq event stamp: 559266
hardirqs last enabled at (559263): [<ffffffff811b40a5>] acpi_idle_enter_bm+0x236/0x275
hardirqs last disabled at (559264): [<ffffffff812ffd6a>] save_args+0x6a/0x70
softirqs last enabled at (559266): [<ffffffff81044bce>] _local_bh_enable+0xe/0x10
softirqs last disabled at (559265): [<ffffffff81044c0d>] irq_enter+0x3d/0x6f

other info that might help us debug this:
1 lock held by swapper/0:
#0: (&serio->lock){-.-...}, at: [<ffffffff81259018>] serio_interrupt+0x24/0x7e

stack backtrace:
Pid: 0, comm: swapper Tainted: G W 2.6.38-rc3-bisect-00062-g8c007a1 #92
Call Trace:
<IRQ> [<ffffffff81065442>] ? valid_state+0x17e/0x191
[<ffffffff8100e159>] ? save_stack_trace+0x28/0x45
[<ffffffff81065d18>] ? check_usage_forwards+0x0/0x87
[<ffffffff81065568>] ? mark_lock+0x113/0x22d
[<ffffffff8106687f>] ? __lock_acquire+0x2ac/0xd01
[<ffffffff81065482>] ? mark_lock+0x2d/0x22d
[<ffffffff81065482>] ? mark_lock+0x2d/0x22d
[<ffffffff81065482>] ? mark_lock+0x2d/0x22d
[<ffffffff81067338>] ? lock_acquire+0x64/0x81
[<ffffffff8125def6>] ? input_event+0x41/0x83
[<ffffffff812ff378>] ? _raw_spin_lock_irqsave+0x53/0x65
[<ffffffff8125def6>] ? input_event+0x41/0x83
[<ffffffff8125def6>] ? input_event+0x41/0x83
[<ffffffff81261fb6>] ? atkbd_interrupt+0x9a/0x574
[<ffffffff81259034>] ? serio_interrupt+0x40/0x7e
[<ffffffff8125a31c>] ? i8042_interrupt+0x28e/0x2a8
[<ffffffff8107e834>] ? handle_IRQ_event+0x20/0xae
[<ffffffff810805e6>] ? handle_edge_irq+0xde/0x12a
[<ffffffff810055c9>] ? handle_irq+0x1f/0x2a
[<ffffffff81004c37>] ? do_IRQ+0x48/0xaf
[<ffffffff812ffe13>] ? ret_from_intr+0x0/0x1a
<EOI> [<ffffffff8100a1aa>] ? native_sched_clock+0x2d/0x5f
[<ffffffff811b40ac>] ? acpi_idle_enter_bm+0x23d/0x275
[<ffffffff811b40a5>] ? acpi_idle_enter_bm+0x236/0x275
[<ffffffff8126c2ae>] ? cpuidle_idle_call+0x9f/0xd5
[<ffffffff81001e75>] ? cpu_idle+0x61/0xaa
[<ffffffff812ecdc9>] ? rest_init+0xcd/0xd4
[<ffffffff812eccfc>] ? rest_init+0x0/0xd4
[<ffffffff81682d3d>] ? start_kernel+0x3a6/0x3b1
[<ffffffff816822b1>] ? x86_64_start_reservations+0xb8/0xbc
[<ffffffff816823bb>] ? x86_64_start_kernel+0x106/0x115

The cause is that local_bh_enable() could call do_softirq()
in certain time, thus break irq status and make lock bite
us.

Reported-and-bisected-by: Nick Bowler <[email protected]>
Signed-off-by: Yong Zhang <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
kernel/timer.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index 43ca993..253aefe 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -969,10 +969,15 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
int del_timer_sync(struct timer_list *timer)
{
#ifdef CONFIG_LOCKDEP
+ unsigned long flags;
+
local_bh_disable();
lock_map_acquire(&timer->lockdep_map);
lock_map_release(&timer->lockdep_map);
- local_bh_enable();
+ /* raw_local_irq_[save|restore] is to protect _local_bh_enable() */
+ raw_local_irq_save(flags);
+ _local_bh_enable();
+ raw_local_irq_restore(flags);
#endif
/*
* don't use it in hardirq context, because it
--
1.7.1

2011-02-03 09:29:44

by Peter Zijlstra

[permalink] [raw]
Subject: Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).

On Thu, 2011-02-03 at 17:12 +0800, Yong Zhang wrote:
> #ifdef CONFIG_LOCKDEP
> + unsigned long flags;
> +
> local_bh_disable();
> lock_map_acquire(&timer->lockdep_map);
> lock_map_release(&timer->lockdep_map);
> - local_bh_enable();
> + /* raw_local_irq_[save|restore] is to protect _local_bh_enable() */
> + raw_local_irq_save(flags);
> + _local_bh_enable();
> + raw_local_irq_restore(flags);
> #endif

_why_ are you doing the raw_local_irq stuff? That's just weird, and that
comment isn't helping.

Also, calling _local_bh_enable() will leave pending softirqs, not
particularly nice

2011-02-03 10:17:48

by Yong Zhang

[permalink] [raw]
Subject: Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).

On Thu, Feb 03, 2011 at 10:30:40AM +0100, Peter Zijlstra wrote:
> On Thu, 2011-02-03 at 17:12 +0800, Yong Zhang wrote:
> > #ifdef CONFIG_LOCKDEP
> > + unsigned long flags;
> > +
> > local_bh_disable();
> > lock_map_acquire(&timer->lockdep_map);
> > lock_map_release(&timer->lockdep_map);
> > - local_bh_enable();
> > + /* raw_local_irq_[save|restore] is to protect _local_bh_enable() */
> > + raw_local_irq_save(flags);
> > + _local_bh_enable();
> > + raw_local_irq_restore(flags);
> > #endif
>
> _why_ are you doing the raw_local_irq stuff? That's just weird, and that
> comment isn't helping.

Just to avoid trigger WARN_ON_ONCE(!irqs_disabled()) in __local_bh_enable(),

>
> Also, calling _local_bh_enable() will leave pending softirqs, not
> particularly nice

Ah, indeed. That's bad.

Now I'm not sure how to work around this issue, because local_bh_enable()
and _local_bh_enable() are not suitable here.

Maybe we can introduce another type of local_bh_enable() in which
waking up ksoftirqd is forced if needed.

Thanks,
Yong

2011-02-03 10:32:09

by Peter Zijlstra

[permalink] [raw]
Subject: Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).


How about we revert your patch and go back to not allowing
del_timer_sync() from any interrupt context, we can fix __dst_free by
removing the need for cancel_delayed_work().

Are there any other del_timer_sync() callers from softirq?

---
include/linux/timer.h | 1 +
kernel/timer.c | 67 ++++++++++++++++++++++++++++++------------------
kernel/workqueue.c | 4 +-
net/core/dst.c | 1 -
4 files changed, 45 insertions(+), 28 deletions(-)

diff --git a/include/linux/timer.h b/include/linux/timer.h
index 6abd913..3e263f9 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -209,6 +209,7 @@ static inline int timer_pending(const struct timer_list * timer)
extern void add_timer_on(struct timer_list *timer, int cpu);
extern int del_timer(struct timer_list * timer);
extern int mod_timer(struct timer_list *timer, unsigned long expires);
+extern int mod_timer_on(struct timer_list *timer, unsigned long expires, int cpu);
extern int mod_timer_pending(struct timer_list *timer, unsigned long expires);
extern int mod_timer_pinned(struct timer_list *timer, unsigned long expires);

diff --git a/kernel/timer.c b/kernel/timer.c
index 87f656c..c9e98f7 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -648,8 +648,8 @@ static struct tvec_base *lock_timer_base(struct timer_list *timer,
}

static inline int
-__mod_timer(struct timer_list *timer, unsigned long expires,
- bool pending_only, int pinned)
+__mod_timer_on(struct timer_list *timer, unsigned long expires,
+ bool pending_only, int pinned, int cpu)
{
struct tvec_base *base, *new_base;
unsigned long flags;
@@ -673,12 +673,14 @@ __mod_timer(struct timer_list *timer, unsigned long expires,

debug_activate(timer, expires);

- cpu = smp_processor_id();
+ if (cpu == -1) {
+ cpu = smp_processor_id();

#if defined(CONFIG_NO_HZ) && defined(CONFIG_SMP)
- if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu))
- cpu = get_nohz_timer_target();
+ if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu))
+ cpu = get_nohz_timer_target();
#endif
+ }
new_base = per_cpu(tvec_bases, cpu);

if (base != new_base) {
@@ -705,12 +707,30 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
base->next_timer = timer->expires;
internal_add_timer(base, timer);

+ if (cpu != smp_processor_id()) {
+ /*
+ * Check whether the other CPU is idle and needs to be
+ * triggered to reevaluate the timer wheel when nohz is
+ * active. We are protected against the other CPU fiddling
+ * with the timer by holding the timer base lock. This also
+ * makes sure that a CPU on the way to idle can not evaluate
+ * the timer wheel.
+ */
+ wake_up_idle_cpu(cpu);
+ }
out_unlock:
spin_unlock_irqrestore(&base->lock, flags);

return ret;
}

+static inline int
+__mod_timer(struct timer_list *timer, unsigned long expires,
+ bool pending_only, int pinned)
+{
+ return __mod_timer_on(timer, expires, pending_only, pinned, -1);
+}
+
/**
* mod_timer_pending - modify a pending timer's timeout
* @timer: the pending timer to be modified
@@ -825,6 +845,17 @@ int mod_timer_pinned(struct timer_list *timer, unsigned long expires)
}
EXPORT_SYMBOL(mod_timer_pinned);

+int mod_timer_on(struct timer_list *timer, unsigned long expires, int cpu)
+{
+ if (timer_pending(timer) && timer->expires == expires)
+ return 1;
+
+ expires = apply_slack(timer, expires);
+
+ return __mod_timer_on(timer, expires, false, TIMER_NOT_PINNED, cpu);
+}
+EXPORT_SYMBOL(mod_timer_on);
+
/**
* add_timer - start a timer
* @timer: the timer to be added
@@ -860,23 +891,7 @@ void add_timer_on(struct timer_list *timer, int cpu)

timer_stats_timer_set_start_info(timer);
BUG_ON(timer_pending(timer) || !timer->function);
- spin_lock_irqsave(&base->lock, flags);
- timer_set_base(timer, base);
- debug_activate(timer, timer->expires);
- if (time_before(timer->expires, base->next_timer) &&
- !tbase_get_deferrable(timer->base))
- base->next_timer = timer->expires;
- internal_add_timer(base, timer);
- /*
- * Check whether the other CPU is idle and needs to be
- * triggered to reevaluate the timer wheel when nohz is
- * active. We are protected against the other CPU fiddling
- * with the timer by holding the timer base lock. This also
- * makes sure that a CPU on the way to idle can not evaluate
- * the timer wheel.
- */
- wake_up_idle_cpu(cpu);
- spin_unlock_irqrestore(&base->lock, flags);
+ mod_timer_on(timer, timer->expires, cpu);
}
EXPORT_SYMBOL_GPL(add_timer_on);

@@ -959,7 +974,7 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
*
* Synchronization rules: Callers must prevent restarting of the timer,
* otherwise this function is meaningless. It must not be called from
- * hardirq contexts. The caller must not hold locks which would prevent
+ * interrupt contexts. The caller must not hold locks which would prevent
* completion of the timer's handler. The timer's handler must not call
* add_timer_on(). Upon exit the timer is not queued and the handler is
* not running on any CPU.
@@ -969,10 +984,12 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
int del_timer_sync(struct timer_list *timer)
{
#ifdef CONFIG_LOCKDEP
- local_bh_disable();
+ unsigned long flags;
+
+ local_irq_save(flags);
lock_map_acquire(&timer->lockdep_map);
lock_map_release(&timer->lockdep_map);
- local_bh_enable();
+ local_irq_restore(flags);
#endif
/*
* don't use it in hardirq context, because it
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 11869fa..3944acc 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1160,9 +1160,9 @@ int queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
timer->function = delayed_work_timer_fn;

if (unlikely(cpu >= 0))
- add_timer_on(timer, cpu);
+ mod_timer_on(timer, cpu);
else
- add_timer(timer);
+ mod_timer(timer);
ret = 1;
}
return ret;
diff --git a/net/core/dst.c b/net/core/dst.c
index b99c7c7..a4f28bc 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -207,7 +207,6 @@ void __dst_free(struct dst_entry *dst)
if (dst_garbage.timer_inc > DST_GC_INC) {
dst_garbage.timer_inc = DST_GC_INC;
dst_garbage.timer_expires = DST_GC_MIN;
- cancel_delayed_work(&dst_gc_work);
schedule_delayed_work(&dst_gc_work, dst_garbage.timer_expires);
}
spin_unlock_bh(&dst_garbage.lock);

2011-02-03 11:42:28

by Yong Zhang

[permalink] [raw]
Subject: Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).

On Thu, Feb 03, 2011 at 11:33:04AM +0100, Peter Zijlstra wrote:
>
> How about we revert your patch and go back to not allowing
> del_timer_sync() from any interrupt context,

The fact is there is someone want to use del_timer_sync()
in softirq context.
My patch is inspired by this http://lkml.org/lkml/2010/8/16/291
though in the end that issue is resolved by not using
del_timer_sync() in softirq.

> we can fix __dst_free by
> removing the need for cancel_delayed_work().
>
> Are there any other del_timer_sync() callers from softirq?

I'm not sure about this.

IMHO, we don't need to make more restriction if we don't
have to.

Thanks,
Yong

2011-02-03 11:51:06

by Yong Zhang

[permalink] [raw]
Subject: [PATCH 1/2] softirq: introduce loacal_bh_enable_force_wake()

On Thu, Feb 03, 2011 at 06:17:39PM +0800, Yong Zhang wrote:
> Maybe we can introduce another type of local_bh_enable() in which
> waking up ksoftirqd is forced if needed.

---
From: Yong Zhang <[email protected]>
Subject: [PATCH 1/2] softirq: introduce loacal_bh_enable_force_wake()

If there is pending softirq, don't handle it in the caller's
context, invoke ksoftirqd directly instead.

del_timer_sync() will be the first caller.

Signed-off-by: Yong Zhang <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Andrew Morton <[email protected]>
---
include/linux/bottom_half.h | 1 +
kernel/softirq.c | 19 ++++++++++++++-----
2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/bottom_half.h b/include/linux/bottom_half.h
index 27b1bcf..665d697 100644
--- a/include/linux/bottom_half.h
+++ b/include/linux/bottom_half.h
@@ -5,5 +5,6 @@ extern void local_bh_disable(void);
extern void _local_bh_enable(void);
extern void local_bh_enable(void);
extern void local_bh_enable_ip(unsigned long ip);
+extern void local_bh_enable_force_wake(void);

#endif /* _LINUX_BH_H */
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 68eb5ef..5f224d4 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -154,7 +154,7 @@ void _local_bh_enable(void)

EXPORT_SYMBOL(_local_bh_enable);

-static inline void _local_bh_enable_ip(unsigned long ip)
+static inline void _local_bh_enable_ip(unsigned long ip, bool force_wake)
{
WARN_ON_ONCE(in_irq() || irqs_disabled());
#ifdef CONFIG_TRACE_IRQFLAGS
@@ -171,8 +171,12 @@ static inline void _local_bh_enable_ip(unsigned long ip)
*/
sub_preempt_count(SOFTIRQ_DISABLE_OFFSET - 1);

- if (unlikely(!in_interrupt() && local_softirq_pending()))
- do_softirq();
+ if (unlikely(!in_interrupt() && local_softirq_pending())) {
+ if (!force_wake)
+ do_softirq();
+ else
+ wakeup_softirqd();
+ }

dec_preempt_count();
#ifdef CONFIG_TRACE_IRQFLAGS
@@ -183,16 +187,21 @@ static inline void _local_bh_enable_ip(unsigned long ip)

void local_bh_enable(void)
{
- _local_bh_enable_ip((unsigned long)__builtin_return_address(0));
+ _local_bh_enable_ip((unsigned long)__builtin_return_address(0), false);
}
EXPORT_SYMBOL(local_bh_enable);

void local_bh_enable_ip(unsigned long ip)
{
- _local_bh_enable_ip(ip);
+ _local_bh_enable_ip(ip, false);
}
EXPORT_SYMBOL(local_bh_enable_ip);

+void local_bh_enable_force_wake(void)
+{
+ _local_bh_enable_ip((unsigned long)__builtin_return_address(0), true);
+}
+EXPORT_SYMBOL(local_bh_enable_force_wake);
/*
* We restart softirq processing MAX_SOFTIRQ_RESTART times,
* and we fall back to softirqd after that.
--
1.7.1

2011-02-03 11:54:09

by Yong Zhang

[permalink] [raw]
Subject: [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync()

From: Yong Zhang <[email protected]>
Subject: [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync()

Nick Bowler reported:
=================================
[ INFO: inconsistent lock state ]
2.6.38-rc3-bisect-00062-g8c007a1 #92
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
(&(&dev->event_lock)->rlock){?.+...}, at: [<ffffffff8125def6>] input_event+0x41/0x83
{HARDIRQ-ON-W} state was registered at:
[<ffffffff810656d4>] mark_held_locks+0x52/0x70
[<ffffffff8106593a>] trace_hardirqs_on_caller+0xf7/0x135
[<ffffffff81065985>] trace_hardirqs_on+0xd/0xf
[<ffffffff81044ac2>] __do_softirq+0x62/0x160
[<ffffffff8100395c>] call_softirq+0x1c/0x28
[<ffffffff81044cf2>] _local_bh_enable_ip+0xb3/0xe3
[<ffffffff81044d3a>] local_bh_enable+0xd/0xf
[<ffffffff8104a418>] del_timer_sync+0x4f/0x8e
[<ffffffffa00947fe>] rfkill_schedule_global_op+0x41/0x84 [rfkill]
[<ffffffffa0094858>] rfkill_schedule_evsw_rfkillall+0x17/0x19 [rfkill]
[<ffffffffa009495f>] rfkill_event+0x58/0x5a [rfkill]
[<ffffffff8125c980>] input_pass_event+0x97/0xea
[<ffffffff8125de13>] input_handle_event+0x460/0x46f
[<ffffffff8125df1b>] input_event+0x66/0x83
[<ffffffffa009becb>] tpacpi_send_radiosw_update+0x7f/0xc4 [thinkpad_acpi]
[<ffffffffa009c417>] hotkey_notify+0x507/0x581 [thinkpad_acpi]
[<ffffffffa009b029>] dispatch_acpi_notify+0x29/0x2b [thinkpad_acpi]
[<ffffffff8119a18a>] acpi_ev_notify_dispatch+0x62/0x7c
[<ffffffff8118b323>] acpi_os_execute_deferred+0x24/0x31
[<ffffffff810511bf>] process_one_work+0x204/0x33b
[<ffffffff81053384>] worker_thread+0x136/0x255
[<ffffffff810566da>] kthread+0x7d/0x85
[<ffffffff81003864>] kernel_thread_helper+0x4/0x10
irq event stamp: 559266
hardirqs last enabled at (559263): [<ffffffff811b40a5>] acpi_idle_enter_bm+0x236/0x275
hardirqs last disabled at (559264): [<ffffffff812ffd6a>] save_args+0x6a/0x70
softirqs last enabled at (559266): [<ffffffff81044bce>] _local_bh_enable+0xe/0x10
softirqs last disabled at (559265): [<ffffffff81044c0d>] irq_enter+0x3d/0x6f

other info that might help us debug this:
1 lock held by swapper/0:
#0: (&serio->lock){-.-...}, at: [<ffffffff81259018>] serio_interrupt+0x24/0x7e

stack backtrace:
Pid: 0, comm: swapper Tainted: G W 2.6.38-rc3-bisect-00062-g8c007a1 #92
Call Trace:
<IRQ> [<ffffffff81065442>] ? valid_state+0x17e/0x191
[<ffffffff8100e159>] ? save_stack_trace+0x28/0x45
[<ffffffff81065d18>] ? check_usage_forwards+0x0/0x87
[<ffffffff81065568>] ? mark_lock+0x113/0x22d
[<ffffffff8106687f>] ? __lock_acquire+0x2ac/0xd01
[<ffffffff81065482>] ? mark_lock+0x2d/0x22d
[<ffffffff81065482>] ? mark_lock+0x2d/0x22d
[<ffffffff81065482>] ? mark_lock+0x2d/0x22d
[<ffffffff81067338>] ? lock_acquire+0x64/0x81
[<ffffffff8125def6>] ? input_event+0x41/0x83
[<ffffffff812ff378>] ? _raw_spin_lock_irqsave+0x53/0x65
[<ffffffff8125def6>] ? input_event+0x41/0x83
[<ffffffff8125def6>] ? input_event+0x41/0x83
[<ffffffff81261fb6>] ? atkbd_interrupt+0x9a/0x574
[<ffffffff81259034>] ? serio_interrupt+0x40/0x7e
[<ffffffff8125a31c>] ? i8042_interrupt+0x28e/0x2a8
[<ffffffff8107e834>] ? handle_IRQ_event+0x20/0xae
[<ffffffff810805e6>] ? handle_edge_irq+0xde/0x12a
[<ffffffff810055c9>] ? handle_irq+0x1f/0x2a
[<ffffffff81004c37>] ? do_IRQ+0x48/0xaf
[<ffffffff812ffe13>] ? ret_from_intr+0x0/0x1a
<EOI> [<ffffffff8100a1aa>] ? native_sched_clock+0x2d/0x5f
[<ffffffff811b40ac>] ? acpi_idle_enter_bm+0x23d/0x275
[<ffffffff811b40a5>] ? acpi_idle_enter_bm+0x236/0x275
[<ffffffff8126c2ae>] ? cpuidle_idle_call+0x9f/0xd5
[<ffffffff81001e75>] ? cpu_idle+0x61/0xaa
[<ffffffff812ecdc9>] ? rest_init+0xcd/0xd4
[<ffffffff812eccfc>] ? rest_init+0x0/0xd4
[<ffffffff81682d3d>] ? start_kernel+0x3a6/0x3b1
[<ffffffff816822b1>] ? x86_64_start_reservations+0xb8/0xbc
[<ffffffff816823bb>] ? x86_64_start_kernel+0x106/0x115

The cause is that local_bh_enable() could call do_softirq()
in certain time, thus break irq status and make lock bite
us.

Reported-and-bisected-by: Nick Bowler <[email protected]>
Signed-off-by: Yong Zhang <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Thomas Gleixner <[email protected]>

---
kernel/timer.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index 43ca993..1654dc1 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -972,7 +972,7 @@ int del_timer_sync(struct timer_list *timer)
local_bh_disable();
lock_map_acquire(&timer->lockdep_map);
lock_map_release(&timer->lockdep_map);
- local_bh_enable();
+ local_bh_enable_force_wake();
#endif
/*
* don't use it in hardirq context, because it
--
1.7.1

2011-02-08 16:54:25

by Peter Zijlstra

[permalink] [raw]
Subject: Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).

On Thu, 2011-02-03 at 19:42 +0800, Yong Zhang wrote:
> My patch is inspired by this http://lkml.org/lkml/2010/8/16/291

Right, so that splat warns about:

spin_lock_irqsave(&shost->host_lock);
del_timer_sync()
spin_lock_irqsave(&t->split_timeout_timer);

vs

<SOFTIRQ>
call_timer_fn()
spin_lock(&t->split_timeout_timer);
<IRQ>
spin_lock_irqsave(&shost->host_lock);

Which Steven just pointed out is an actual deadlock because:


CPU0 CPU1
spin_lock_irqsave(&shost->host_lock);
<SOFTIRQ>
call_timer_fn();
base->running_timer = timer;
func();
<IRQ>
spin_lock(&shost->host_lock);
del_timer_sync()
while(base->running_timer == timer);

And the world stops turning..


Also note that your patch changes the above to:

spin_lock_irqsave(&shost->host_lock);
del_timer_sync()
spin_lock_bh(&t->split_timeout_timer);

Which wouldn't make one bit of difference.

2011-02-08 17:39:57

by Steven Rostedt

[permalink] [raw]
Subject: [PATCH] lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause

Twice I had to explain the output about why lockdep gives an error with
locks in IRQ context and with del_timer_sync(). Might as well write it
up and place it in the comments above the code in del_timer_sync().
Perhaps the next time this lockdep dump triggers people will understand
the issues.

It is a ticky issue and very subtle, explaining it in detail in the code
may help others understand the issue when they stumble upon the bug
again.

Signed-off-by: Steven Rostedt <[email protected]>

diff --git a/kernel/timer.c b/kernel/timer.c
index d53ce66..d748321 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -964,6 +964,25 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
* add_timer_on(). Upon exit the timer is not queued and the handler is
* not running on any CPU.
*
+ * Note: You must not hold locks that are held in interrupt context
+ * while calling this function. Even if the lock has nothing to do
+ * with the timer in question. Here's why:
+ *
+ * CPU0 CPU1
+ * ---- ----
+ * <SOFTIRQ>
+ * call_timer_fn();
+ * base->running_timer = mytimer;
+ * spin_lock_irq(somelock);
+ * <IRQ>
+ * spin_lock(somelock);
+ * del_timer_sync(mytimer);
+ * while (base->running_timer == mytimer);
+ *
+ * Now del_timer_sync() will never return and never release somelock.
+ * The interrupt on the other CPU is waiting to grab somelock but
+ * it has interrupted the softirq that CPU0 is waiting to finish.
+ *
* The function returns whether it has deactivated a pending timer or not.
*/
int del_timer_sync(struct timer_list *timer)
@@ -971,6 +990,10 @@ int del_timer_sync(struct timer_list *timer)
#ifdef CONFIG_LOCKDEP
unsigned long flags;

+ /*
+ * If lockdep gives a backtrace here, please reference
+ * the synchronization rules above.
+ */
raw_local_irq_save(flags);
local_bh_disable();
lock_map_acquire(&timer->lockdep_map);

2011-02-16 13:51:30

by Steven Rostedt

[permalink] [raw]
Subject: [tip:core/locking] lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause

Commit-ID: 48228f7b470a74b6469a250d2977a13128d8fe96
Gitweb: http://git.kernel.org/tip/48228f7b470a74b6469a250d2977a13128d8fe96
Author: Steven Rostedt <[email protected]>
AuthorDate: Tue, 8 Feb 2011 12:39:54 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 16 Feb 2011 13:35:08 +0100

lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause

Twice I had to explain the output about why lockdep gives an error with
locks in IRQ context and with del_timer_sync(). Might as well write it
up and place it in the comments above the code in del_timer_sync().
Perhaps the next time this lockdep dump triggers people will understand
the issues.

It is a ticky issue and very subtle, explaining it in detail in the code
may help others understand the issue when they stumble upon the bug
again.

Signed-off-by: Steven Rostedt <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/timer.c | 23 +++++++++++++++++++++++
1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index d645992..5f40c2e 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -964,6 +964,25 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
* add_timer_on(). Upon exit the timer is not queued and the handler is
* not running on any CPU.
*
+ * Note: You must not hold locks that are held in interrupt context
+ * while calling this function. Even if the lock has nothing to do
+ * with the timer in question. Here's why:
+ *
+ * CPU0 CPU1
+ * ---- ----
+ * <SOFTIRQ>
+ * call_timer_fn();
+ * base->running_timer = mytimer;
+ * spin_lock_irq(somelock);
+ * <IRQ>
+ * spin_lock(somelock);
+ * del_timer_sync(mytimer);
+ * while (base->running_timer == mytimer);
+ *
+ * Now del_timer_sync() will never return and never release somelock.
+ * The interrupt on the other CPU is waiting to grab somelock but
+ * it has interrupted the softirq that CPU0 is waiting to finish.
+ *
* The function returns whether it has deactivated a pending timer or not.
*/
int del_timer_sync(struct timer_list *timer)
@@ -971,6 +990,10 @@ int del_timer_sync(struct timer_list *timer)
#ifdef CONFIG_LOCKDEP
unsigned long flags;

+ /*
+ * If lockdep gives a backtrace here, please reference
+ * the synchronization rules above.
+ */
local_irq_save(flags);
lock_map_acquire(&timer->lockdep_map);
lock_map_release(&timer->lockdep_map);