Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752521Ab3FFVaS (ORCPT ); Thu, 6 Jun 2013 17:30:18 -0400 Received: from mail.candelatech.com ([208.74.158.172]:47442 "EHLO ns3.lanforge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751115Ab3FFVaQ (ORCPT ); Thu, 6 Jun 2013 17:30:16 -0400 From: greearb@candelatech.com To: linux-kernel@vger.kernel.org Cc: eric.dumazet@gmail.com, tj@kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org, Ben Greear Subject: [PATCH v3] Fix lockup related to stop_machine being stuck in __do_softirq. Date: Thu, 6 Jun 2013 14:29:49 -0700 Message-Id: <1370554189-31432-1-git-send-email-greearb@candelatech.com> X-Mailer: git-send-email 1.7.3.4 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8328 Lines: 180 From: Ben Greear The stop machine logic can lock up if all but one of the migration threads make it through the disable-irq step and the one remaining thread gets stuck in __do_softirq. The reason __do_softirq can hang is that it has a bail-out based on jiffies timeout, but in the lockup case, jiffies itself is not incremented. To work around this, re-add the max_restart counter in __do_irq and stop processing irqs after 10 restarts. Thanks to Tejun Heo and Rusty Russell and others for helping me track this down. This was introduced in 3.9 by commit: c10d73671ad30f5469 (softirq: reduce latencies). It may be worth looking into ath9k to see if it has issues with it's irq handler at a later date. The hang stack traces look something like this: ------------[ cut here ]------------ WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7() Hardware name: To be filled by O.E.M. Watchdog detected hard LOCKUP on cpu 2 Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc] Pid: 23, comm: migration/2 Tainted: G C 3.9.4+ #11 Call Trace: [] warn_slowpath_common+0x85/0x9f [] warn_slowpath_fmt+0x46/0x48 [] watchdog_overflow_callback+0x9c/0xa7 [] __perf_event_overflow+0x137/0x1cb [] ? x86_perf_event_set_period+0x103/0x10f [] perf_event_overflow+0x14/0x16 [] intel_pmu_handle_irq+0x2dc/0x359 [] perf_event_nmi_handler+0x19/0x1b [] nmi_handle+0x7f/0xc2 [] ? oops_begin+0xa9/0xa9 [] do_nmi+0xbc/0x304 [] end_repeat_nmi+0x1e/0x2e [] ? vprintk_emit+0x40a/0x444 [] ? stop_machine_cpu_stop+0xd8/0x274 [] ? stop_machine_cpu_stop+0xd8/0x274 [] ? stop_machine_cpu_stop+0xd8/0x274 <> [] ? copy_module_from_fd+0xe7/0xe7 [] ? copy_module_from_fd+0xe7/0xe7 [] ? copy_module_from_fd+0xe7/0xe7 [] ? stop_one_cpu_nowait+0x30/0x30 [] cpu_stopper_thread+0xae/0x162 [] ? __schedule+0x5ef/0x637 [] ? _raw_spin_unlock_irqrestore+0x47/0x7e [] ? trace_hardirqs_on_caller+0x123/0x15a [] ? trace_hardirqs_on+0xd/0xf [] ? _raw_spin_unlock_irqrestore+0x70/0x7e [] smpboot_thread_fn+0x258/0x260 [] ? test_ti_thread_flag.clone.0+0x11/0x11 [] kthread+0xc7/0xcf [] ? __init_kthread_worker+0x5b/0x5b [] ret_from_fork+0x7c/0xb0 [] ? __init_kthread_worker+0x5b/0x5b ---[ end trace 4947dfa9b0a4cec3 ]--- BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17] Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc] irq event stamp: 835637905 hardirqs last enabled at (835637904): [] __do_softirq+0x9f/0x257 hardirqs last disabled at (835637905): [] apic_timer_interrupt+0x6d/0x80 softirqs last enabled at (5654720): [] __do_softirq+0x1ff/0x257 softirqs last disabled at (5654725): [] irq_exit+0x5f/0xbb CPU 1 Pid: 17, comm: migration/1 Tainted: G WC 3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M. RIP: 0010:[] [] tasklet_hi_action+0xf0/0xf0 RSP: 0018:ffff88022bc83ef0 EFLAGS: 00000212 RAX: 0000000000000006 RBX: ffff880217deb710 RCX: 0000000000000006 RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffffffff81a050b0 RBP: ffff88022bc83f78 R08: ffffffff81a050b0 R09: ffff88022bc83cc8 R10: 00000000000005f2 R11: ffff8802203aaf50 R12: ffff88022bc83e68 R13: ffffffff815f48b2 R14: ffff88022bc83f78 R15: ffff88022230e000 FS: 0000000000000000(0000) GS:ffff88022bc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000430070 CR3: 00000001cbc5d000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process migration/1 (pid: 17, threadinfo ffff88022230e000, task ffff8802223142c0) Stack: ffffffff8109f539 ffff88022bc83f08 ffff88022230e010 042080402bc83f88 000000010021bfcd 000000012bc83fa8 ffff88022230e000 ffff88022230ffd8 0000000000000030 ffff880200000006 00000248d8cdab1c 1304da35fe841722 Call Trace: [] ? __do_softirq+0x117/0x257 [] irq_exit+0x5f/0xbb [] smp_apic_timer_interrupt+0x8a/0x98 [] apic_timer_interrupt+0x72/0x80 [] ? vprintk_emit+0x417/0x444 [] printk+0x4d/0x4f [] ? cpu_stopper_thread+0x57/0x162 [] stop_machine_cpu_stop+0x22c/0x274 [] ? copy_module_from_fd+0xe7/0xe7 [] ? copy_module_from_fd+0xe7/0xe7 [] ? copy_module_from_fd+0xe7/0xe7 [] ? stop_one_cpu_nowait+0x30/0x30 [] cpu_stopper_thread+0xae/0x162 [] ? __schedule+0x5ef/0x637 [] ? _raw_spin_unlock_irqrestore+0x47/0x7e [] ? trace_hardirqs_on_caller+0x123/0x15a [] ? trace_hardirqs_on+0xd/0xf [] ? _raw_spin_unlock_irqrestore+0x70/0x7e [] smpboot_thread_fn+0x258/0x260 [] ? test_ti_thread_flag.clone.0+0x11/0x11 [] kthread+0xc7/0xcf [] ? __init_kthread_worker+0x5b/0x5b [] ret_from_fork+0x7c/0xb0 [] ? __init_kthread_worker+0x5b/0x5b Code: 1c 25 18 e2 00 00 e8 cd fe ff ff e8 ac a4 04 00 fb 66 66 90 66 66 90 4c 89 e3 48 85 db 0f 85 79 ff ff ff 5f 5b 41 5c 41 5d c9 c3 <55> 48 89 e5 41 55 41 54 53 4 Signed-off-by: Ben Greear --- v3: More comments to explain why we need the max-softirq-restart limit as well as the time limit. kernel/softirq.c | 13 ++++++++++--- 1 files changed, 10 insertions(+), 3 deletions(-) diff --git a/kernel/softirq.c b/kernel/softirq.c index 14d7758..d93dcb1 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -195,8 +195,12 @@ void local_bh_enable_ip(unsigned long ip) EXPORT_SYMBOL(local_bh_enable_ip); /* - * We restart softirq processing for at most 2 ms, - * and if need_resched() is not set. + * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times, + * but break the loop if need_resched() is set or after 2 ms. + * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in + * certain cases, such as stop_machine(), jiffies may cease to + * increment and so we need the MAX_SOFTIRQ_RESTART limit as + * well to make sure we eventually return from this method. * * These limits have been established via experimentation. * The two things to balance is latency against fairness - @@ -204,6 +208,7 @@ EXPORT_SYMBOL(local_bh_enable_ip); * should not be able to lock up the box. */ #define MAX_SOFTIRQ_TIME msecs_to_jiffies(2) +#define MAX_SOFTIRQ_RESTART 10 asmlinkage void __do_softirq(void) { @@ -212,6 +217,7 @@ asmlinkage void __do_softirq(void) unsigned long end = jiffies + MAX_SOFTIRQ_TIME; int cpu; unsigned long old_flags = current->flags; + int max_restart = MAX_SOFTIRQ_RESTART; /* * Mask out PF_MEMALLOC s current task context is borrowed for the @@ -265,7 +271,8 @@ restart: pending = local_softirq_pending(); if (pending) { - if (time_before(jiffies, end) && !need_resched()) + if (time_before(jiffies, end) && !need_resched() && + --max_restart) goto restart; wakeup_softirqd(); -- 1.7.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/