Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754261AbbGWVRz (ORCPT ); Thu, 23 Jul 2015 17:17:55 -0400
Received: from casper.infradead.org ([85.118.1.10]:59002 "EHLO
	casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753716AbbGWVRx (ORCPT );
	Thu, 23 Jul 2015 17:17:53 -0400
Date: Thu, 23 Jul 2015 23:17:44 +0200
From: Peter Zijlstra
To: Andy Lutomirski
Cc: X86 ML, "linux-kernel@vger.kernel.org", Willy Tarreau,
	Borislav Petkov, Thomas Gleixner, Linus Torvalds, Steven Rostedt,
	Brian Gerst
Subject: Re: Dealing with the NMI mess
Message-ID: <20150723211744.GM25159@twins.programming.kicks-ass.net>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1866
Lines: 46

On Thu, Jul 23, 2015 at 01:21:16PM -0700, Andy Lutomirski wrote:
> 3. Forbid faults (other than MCE) inside NMI.
>
> Option 3 is almost easy. There are really only two kinds of faults
> that can legitimately nest inside NMI: #PF and #DB. #DB is easy to
> fix (e.g. with my patches or Peter's patches).
>
> What if we went all out and forbade page faults in NMI as well. There
> are two reasons that I can think of that we might page fault inside an
> NMI:
>
> b) User memory access faults.
>
> The reason we access user state in general from an NMI is to allow
> perf to capture enough user stack data to let the tooling backtrace
> back to user space. What if we did it differently? Instead of
> capturing this data in NMI context, capture it in
> prepare_exit_to_usermode.

> Peter, can this be done without breaking the perf ABI? If we were
> designing all of this stuff from scratch right now, I'd suggest doing
> it this way, but I'm not sure whether it makes sense to try to
> retrofit it in.

Not really; but also almost :/

So the thing is that we currently attach the user backtrace to all
events -- and there can be many before we return to userspace again.
With your proposal none of those events would have a userspace stack;
I'm sure that's going to confuse the tooling.

OTOH, userspace stacks are a best effort thing; we bail at the first
sign of trouble (e.g. the stack page is not there). Now realistically
this 'never' happens, and when it does it results in consistently
truncated user traces, whereas your proposal would result in a whole
bunch of events with no user traces and then an 'extra' event with
one.
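
To make the difference in record shape concrete, here is a toy userspace
model of the two schemes. It is only a sketch: struct sample, struct trace
and all the function names are made up for illustration and do not
correspond to the perf ABI or to any kernel code; exit_to_user() merely
stands in for a hook at prepare_exit_to_usermode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EVENTS 16

/* Hypothetical record layout for illustration only; not the perf ABI. */
struct sample {
	int  id;              /* -1 marks the 'extra' deferred record */
	bool has_user_stack;
};

struct trace {
	struct sample events[MAX_EVENTS];
	size_t        count;
	bool          user_stack_pending; /* set at sample time, cleared on exit */
};

static void emit(struct trace *t, int id, bool has_user_stack)
{
	assert(t != NULL);
	if (t->count < MAX_EVENTS)
		t->events[t->count++] =
			(struct sample){ .id = id, .has_user_stack = has_user_stack };
}

/* Current scheme: every NMI-time sample carries its own user backtrace. */
static void sample_inline(struct trace *t, int id)
{
	emit(t, id, true);
}

/* Deferred scheme: the NMI-time sample has no user stack, only a flag. */
static void sample_deferred(struct trace *t, int id)
{
	emit(t, id, false);
	t->user_stack_pending = true;
}

/* Stand-in for the exit-to-usermode hook: one extra record, if needed. */
static void exit_to_user(struct trace *t)
{
	if (t->user_stack_pending)
		emit(t, -1, true);
	t->user_stack_pending = false;
}

/* Number of records that carry a user stack. */
static size_t count_with_stack(const struct trace *t)
{
	size_t n = 0;

	for (size_t i = 0; i < t->count; i++)
		n += t->events[i].has_user_stack;
	return n;
}
```

With three samples taken before returning to userspace, the inline model
ends up with three records that all carry a user stack, while the deferred
model ends up with four records of which only the last one does. That
trailing record is the 'extra' event the tooling would have to learn to
associate with the earlier ones.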