Date: Mon, 6 Apr 2015 12:23:33 -0500
From: Chris J Arges
To: Linus Torvalds
Cc: Ingo Molnar, Rafael David Tinoco, Peter Anvin, Jiang Liu, Peter Zijlstra,
    LKML, Jens Axboe, Frederic Weisbecker, Gema Gomez,
    the arch/x86 maintainers
Subject: Re: smp_call_function_single lockups
Message-ID: <20150406172332.GA14555@canonical.com>

On Thu, Apr 02, 2015 at 10:31:50AM -0700, Linus Torvalds wrote:
> On Wed, Apr 1, 2015 at 2:59 PM, Chris J Arges wrote:
> >
> > Is it worthwhile to do a 'bisect' to see where on average it takes
> > longer to reproduce? Perhaps it will point to a relevant change, or it
> > may be completely useless.
>
> It's likely to be an exercise in futility. "git bisect" is really bad
> at "gray area" things, and when it's a question of "it takes hours or
> days to reproduce", it's almost certainly not worth it. Not unless
> there is some really clear cut-off that we can believably say "this
> causes it to get much slower". And in this case, I don't think it's
> that clear-cut. Judging by DaveJ's attempts at bisecting things, the
> timing just changes. And the differences might be due to entirely
> unrelated changes like cacheline alignment etc.
>
> So unless we find a real clear signature of the bug (I was hoping that
> the ISR bit would be that sign), I don't think trying to bisect it
> based on how quickly you can reproduce things is worthwhile.
>
>                  Linus
>

Linus, Ingo,

I did some testing and found that at the following patch level, the issue
was much, much more likely to reproduce within < 15m.

commit b6b8a1451fc40412c57d10c94b62e22acab28f94
Author: Jan Kiszka
Date:   Fri Mar 7 20:03:12 2014 +0100

    KVM: nVMX: Rework interception of IRQs and NMIs

    Move the check for leaving L2 on pending and intercepted IRQs or NMIs
    from the *_allowed handler into a dedicated callback. Invoke this
    callback at the relevant points before KVM checks if IRQs/NMIs can be
    injected. The callback has the task to switch from L2 to L1 if needed
    and inject the proper vmexit events.

    The rework fixes L2 wakeups from HLT and provides the foundation for
    preemption timer emulation.
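For anyone skimming, here is a rough user-space sketch of the ordering that
commit describes: a dedicated check-nested-events style callback consulted
*before* the generic injection path decides where an interrupt goes. This is
not KVM code; the names (demo_vcpu, demo_check_nested_events,
demo_inject_pending_event) are invented for illustration only.

/*
 * ROUGH SKETCH ONLY -- plain user-space C, not KVM code.  All names here
 * are invented for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

struct demo_vcpu {
	bool in_l2;              /* currently running the nested (L2) guest */
	bool l1_intercepts_irq;  /* L1 asked to intercept external IRQs */
	bool irq_pending;        /* an external interrupt is waiting */
};

/* Dedicated callback: leave L2 for L1 if the pending event is intercepted. */
static void demo_check_nested_events(struct demo_vcpu *v)
{
	if (v->in_l2 && v->irq_pending && v->l1_intercepts_irq) {
		v->in_l2 = false;                /* emulate the vmexit to L1 */
		printf("vmexit: pending IRQ handed to L1\n");
	}
}

/* Generic injection path: consult the callback before injecting anything. */
static void demo_inject_pending_event(struct demo_vcpu *v)
{
	demo_check_nested_events(v);
	if (v->irq_pending) {
		printf("IRQ injected into %s\n", v->in_l2 ? "L2" : "L1");
		v->irq_pending = false;
	}
}

int main(void)
{
	struct demo_vcpu v = {
		.in_l2 = true,
		.l1_intercepts_irq = true,
		.irq_pending = true,
	};

	demo_inject_pending_event(&v);
	return 0;
}

The point of the rework, as the commit text says, is that the L2-to-L1 switch
happens in one place before the injection decision, instead of being buried in
the *_allowed handlers.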
However, with the following patch applied the issue became much harder to
reproduce (the stress reproducer ran for hours without hitting it):

commit 9242b5b60df8b13b469bc6b7be08ff6ebb551ad3
Author: Bandan Das
Date:   Tue Jul 8 00:30:23 2014 -0400

    KVM: x86: Check for nested events if there is an injectable interrupt

    With commit b6b8a1451fc40412c57d1 that introduced vmx_check_nested_events,
    checks for injectable interrupts happen at different points in time for
    L1 and L2 that could potentially cause a race. The regression occurs
    because KVM_REQ_EVENT is always set when nested_run_pending is set, even
    if there's no pending interrupt. Consequently, there could be a small
    window when check_nested_events returns without exiting to L1, but an
    interrupt comes through soon after and it incorrectly gets injected to
    L2 by inject_pending_event. Fix this by adding a call to check for
    nested events too when a check for injectable interrupt returns true.

However, we reproduced with v3.19 (which contains both of these patches), and
it did eventually hit a soft lockup with a similar backtrace. So far this
agrees with the current understanding that we may not be ACK'ing certain
interrupts (IPIs from the L1 guest), causing csd_lock_wait to spin and
triggering the soft lockup (a rough user-space sketch of that spin pattern is
appended at the end of this mail).

Hopefully this helps shed more light on this issue.

Thanks,
--chris j arges
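Appendix: the spin pattern referenced above, as a rough user-space model. This
is NOT kernel code; only the busy-wait shape mirrors csd_lock_wait() in
kernel/smp.c, and all names here (csd_locked, deliver_ipi, target_cpu,
csd_lock_wait_demo) are invented for the demo. The idea is simply that the
sender sets a lock flag, sends the IPI, and spins until the target's handler
clears the flag -- so if the IPI is never delivered or ACK'd, the sender spins
forever and the watchdog reports a soft lockup.

/* ROUGH SKETCH ONLY -- plain user-space C with pthreads, not kernel code. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int csd_locked = 1;  /* stands in for CSD_FLAG_LOCK */
static int deliver_ipi = 1;        /* set to 0 to model the lost/unacked IPI */

/* Target CPU's IPI handler: run the queued function, then unlock the csd. */
static void *target_cpu(void *arg)
{
	(void)arg;
	if (deliver_ipi) {
		printf("target: ran queued function, unlocking csd\n");
		atomic_store(&csd_locked, 0);
	}
	return NULL;
}

/* Sender side: the csd_lock_wait()-style pattern -- spin until unlocked. */
static void csd_lock_wait_demo(void)
{
	unsigned long spins = 0;

	while (atomic_load(&csd_locked))
		spins++;   /* with deliver_ipi == 0 this never terminates */

	printf("sender: csd unlocked after %lu spins\n", spins);
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, target_cpu, NULL);
	csd_lock_wait_demo();
	pthread_join(t, NULL);
	return 0;
}

Build with something like "cc -pthread demo.c"; flipping deliver_ipi to 0
reproduces the "stuck forever waiting on the csd" symptom in miniature.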