Date: Wed, 23 Mar 2016 08:36:09 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Mike Galbraith
Cc: Ion Badulescu, linux-kernel@vger.kernel.org
Subject: Re: rcu stalls and soft lockups with recent kernels
Message-ID: <20160323153609.GB4287@linux.vnet.ibm.com>
References: <56B39F0C.7070406@badula.org> <56B3A2A8.6000302@badula.org>
 <56E9868E.4000006@badula.org> <1458268137.3859.5.camel@gmail.com>
 <56F1A995.4000006@badula.org> <1458708614.14170.11.camel@gmail.com>
In-Reply-To: <1458708614.14170.11.camel@gmail.com>

On Wed, Mar 23, 2016 at 05:50:14AM +0100, Mike Galbraith wrote:
> (cc)
> 
> On Tue, 2016-03-22 at 16:22 -0400, Ion Badulescu wrote:
> > On 03/17/2016 10:28 PM, Mike Galbraith wrote:
> > > On Wed, 2016-03-16 at 12:15 -0400, Ion Badulescu wrote:
> > > > Just following up to my own email:
> > > > 
> > > > It turns out that we can eliminate the RCU stalls by changing from
> > > > CONFIG_RCU_NOCB_CPU_ALL to CONFIG_RCU_NOCB_CPU_NONE. Letting each
> > > > cpu handle its own RCU callbacks completely fixes the problems for
> > > > us.
> > > > 
> > > > Now, CONFIG_NO_HZ_FULL and CONFIG_RCU_NOCB_CPU_ALL is the default
> > > > config for fedora and rhel7. Ho-humm...
> > > 
> > > All RCU offloaded to CPU0 of a big box seems like a very bad idea.
> > 
> > It's not offloaded to CPU0, is it? Those rcuo* threads are not
> > cpu-bound and can run on any cpu the scheduler will put them on. In
> > any case, there was no indication that the rcuo* threads wanted to
> > run but couldn't get cpu time.
> 
> Right, my thinker was screwed on cross-threaded.
> 
> > Anyway, it looks like I spoke too soon. The problems happen less
> > often with RCU_NOCB_CPU_NONE than with RCU_NOCB_CPU_ALL, but the
> > soft lockups and rcu stalls are still happening.
> > 
> > [44206.316711] clocksource: timekeeping watchdog: Marking clocksource 'tsc' as unstable because the skew is too large:
> > [44206.328463] clocksource:                       'hpet' wd_now: ffffffff wd_last: 5f03cdca mask: ffffffff
> > [44206.339037] clocksource:                       'tsc' cs_now: 64788b443c3a cs_last: 647840eea919 mask: ffffffffffffffff
> > [44206.351253] clocksource: Switched to clocksource hpet
> > [44922.301452] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [44922.307644]  0-...: (1 GPs behind) idle=53d/140000000000001/0 softirq=8515474/8515477 fqs=6994
> > [44922.317435]  (detected by 1, t=21019 jiffies, g=2011397, c=2011396, q=3263)
> > [44922.325274] Task dump for CPU 0:
> > [44922.325276] python          R  running task  0 257113 257112 0x00080088 0 FAIR 1 0 152294 48373
> > [44922.325283]  ffffffff8152ca8e ffff881b76870000 ffff880e83669000 0000000000007d54
> > [44922.333671]  ffff881b1cdc7a48 ffff880a58a57e58 0000000000000086 0000000000000000
> > [44922.342060]  0000000000000000 0000000000003fa1 ffff880a58a54000 ffff880a58a57e88
> > [44922.350446] Call Trace:
> > [44922.353215]  [] ? __schedule+0x38e/0xa90
> > [44922.359388]  [] ? rcu_eqs_enter_common+0x66/0x130
> > [44922.366437]  [] ? acct_account_cputime+0x1c/0x20
> > [44922.373388]  [] ? account_user_time+0x78/0x80
> > [44922.380045]  [] ? vtime_account_user+0x43/0x60
> > [44922.386801]  [] ? __context_tracking_exit+0x70/0xc0
> > [44922.394044]  [] ? enter_from_user_mode+0x1f/0x50
> > [44922.400994]  [] ? apic_timer_interrupt+0x69/0x90
> > [44923.210453] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [python:257113]
> > [44923.218890] Modules linked in: nfsv3 nfs_acl nfs msr autofs4 lockd grace sunrpc cachefiles fscache binfmt_misc nls_iso8859_1 nls_cp437 vfat fat vhost_net vhost tun kvm irqbypass input_leds hid_generic iTCO_wdt iTCO_vendor_support pcspkr sfc mtd i2c_algo_bit sb_edac sg lpc_ich mfd_core ehci_pci ehci_hcd xhci_pci xhci_hcd i2c_i801 i2c_core ixgbe ptp pps_core mdio ipmi_devintf ipmi_si ipmi_msghandler tpm_tis tpm acpi_power_meter hwmon ext4 jbd2 mbcache crc16 raid1 dm_mirror dm_region_hash dm_log dm_mod
> > [44923.269289] CPU: 0 PID: 257113 Comm: python Tainted: G I 4.4.5-el6.ia32e.lime.0 #1
> > [44923.279089] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0011.081020151200 08/10/2015
> > [44923.290824] task: ffff880e83669000 ti: ffff880a58a54000 task.ti: ffff880a58a54000
> > [44923.299253] RIP: 0033:[<00002b5ee4e0502d>]  [<00002b5ee4e0502d>] 0x2b5ee4e0502d
> > [44923.307508] RSP: 002b:00007ffe4120b170  EFLAGS: 00000212
> > [44923.313494] RAX: 0000000000000008 RBX: 0000000003fe1480 RCX: 0000000003fc9b00
> > [44923.321513] RDX: 00002b5ee34b4260 RSI: 0000000000002248 RDI: 00000000000000d4
> > [44923.329530] RBP: 0000000003fe1800 R08: 0000000000000078 R09: 0000000000000800
> > [44923.337548] R10: 00007ffe4120b550 R11: 0000000000011240 R12: 00002b5ee34ba938
> > [44923.345566] R13: 00002b5ee34c1010 R14: 00007ffe4120b428 R15: 0000000000000400
> > [44923.353580] FS:  00002b5ec7fe98c0(0000) GS:ffff88103fc00000(0000) knlGS:0000000000000000
> > [44923.362689] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [44923.369147] CR2: 00002b5ee4190000 CR3: 00000005ac81b000 CR4: 00000000001406f0
> > 
> > That rcu_eqs_enter_common function seems to be a fairly common
> > occurrence in these stack traces. Not sure if it means anything,
> > though.
> > 
> > Also, this seems to be a soft lockup with RIP in userspace.
> > Does it mean timer interrupts are disabled? Somehow the NMI timer
> > fails to get rescheduled.
> > 
> > We're at our wits' end here...
> > 
> > This, btw, is a 2x12-core Haswell box.

This one would definitely make Salvador Dali proud!

All I can suggest is to reconfigure to get the highest error rate, then
bisect.

							Thanx, Paul
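
P.S.  For readers decoding the clocksource watchdog lines quoted above:
the watchdog periodically reads both the watchdog clock (hpet here) and
the clock under test (tsc), converts both intervals to nanoseconds, and
marks the clock unstable when they disagree by more than 62.5 ms
(WATCHDOG_THRESHOLD in kernel/time/clocksource.c as of 4.4).  Below is
a minimal userspace sketch of that arithmetic, fed with the values from
the log.  The 14.318180 MHz HPET rate and the 2.6 GHz TSC rate are
assumptions for illustration, not values read from this box:

  /* Toy model of the clocksource watchdog skew check; illustrative only. */
  #include <stdio.h>
  #include <stdint.h>

  #define NSEC_PER_SEC        1000000000ULL
  #define WATCHDOG_THRESHOLD  (NSEC_PER_SEC >> 4)  /* 62.5 ms, as in 4.4 */

  /* Wrap-safe counter delta, like the kernel's clocksource_delta(). */
  static uint64_t cyc_delta(uint64_t now, uint64_t last, uint64_t mask)
  {
      return (now - last) & mask;
  }

  /* Convert a cycle count to nanoseconds at a given frequency in Hz. */
  static uint64_t cyc2ns(uint64_t cyc, double hz)
  {
      return (uint64_t)((double)cyc / hz * NSEC_PER_SEC);
  }

  int main(void)
  {
      /* wd_now/wd_last/cs_now/cs_last are taken from the log above;
       * the frequencies are guesses. */
      uint64_t wd_nsec = cyc2ns(cyc_delta(0xffffffffULL, 0x5f03cdcaULL,
                                          0xffffffffULL), 14318180.0);
      uint64_t cs_nsec = cyc2ns(cyc_delta(0x64788b443c3aULL,
                                          0x647840eea919ULL,
                                          0xffffffffffffffffULL),
                                2600000000.0);
      uint64_t skew = wd_nsec > cs_nsec ? wd_nsec - cs_nsec
                                        : cs_nsec - wd_nsec;

      printf("hpet interval %llu ns, tsc interval %llu ns, skew %llu ns\n",
             (unsigned long long)wd_nsec, (unsigned long long)cs_nsec,
             (unsigned long long)skew);
      if (skew > WATCHDOG_THRESHOLD)
          printf("-> would mark tsc unstable\n");
      return 0;
  }

With those assumed rates the hpet interval works out to roughly 189
seconds against roughly half a second of tsc, so whichever clock is
lying, the skew dwarfs the threshold and tsc gets thrown out.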
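
P.P.S.  On rcu_eqs_enter_common() showing up in so many of the traces:
with NO_HZ_FULL, that function runs each time a CPU drops into an
extended quiescent state (EQS), that is, idle or nohz_full userspace,
so seeing it under __context_tracking_exit()/enter_from_user_mode()
mostly says the CPU was bouncing between user and kernel context.  The
bookkeeping is a per-CPU counter incremented on every transition, even
while in an EQS and odd otherwise; the grace-period machinery snapshots
it and stops waiting on any CPU whose counter is even or has since
changed.  A toy model, with made-up names rather than the kernel's own
structures:

  /* Toy model of RCU dyntick-idle (EQS) tracking; illustrative only. */
  #include <stdatomic.h>
  #include <stdio.h>

  /* Per-CPU transition counter: even = in EQS, odd = active in kernel. */
  static atomic_uint dynticks;  /* starts at 0: even, "idle" */

  static void eqs_exit(void)    /* e.g. userspace -> kernel entry */
  {
      unsigned v = atomic_fetch_add(&dynticks, 1) + 1;
      printf("exit EQS, dynticks=%u (odd: RCU must watch this CPU)\n", v);
  }

  static void eqs_enter(void)   /* e.g. return to nohz_full userspace */
  {
      unsigned v = atomic_fetch_add(&dynticks, 1) + 1;
      printf("enter EQS, dynticks=%u (even: RCU may ignore this CPU)\n", v);
  }

  /* What the grace-period kthread asks of a snapshot of the counter. */
  static int cpu_in_eqs(unsigned snap)
  {
      return (snap & 1) == 0;
  }

  int main(void)
  {
      unsigned snap = atomic_load(&dynticks);
      printf("snap=%u: %s\n", snap, cpu_in_eqs(snap) ? "skip CPU" : "wait");
      eqs_exit();               /* timer interrupt arrives from userspace */
      snap = atomic_load(&dynticks);
      printf("snap=%u: %s\n", snap, cpu_in_eqs(snap) ? "skip CPU" : "wait");
      eqs_enter();
      return 0;
  }

If the context-tracking transitions themselves ever get starved or
wedged, the counter stays odd and the grace period cannot end, which
could produce stall reports like the "(1 GPs behind)" one quoted above.
That is one more reason the bisection is worth the pain.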