From: Andy Lutomirski
Date: Thu, 23 Jul 2015 14:59:46 -0700
Subject: Re: Dealing with the NMI mess
To: Linus Torvalds
Cc: Peter Zijlstra, X86 ML, linux-kernel@vger.kernel.org, Willy Tarreau, Borislav Petkov, Thomas Gleixner, Steven Rostedt, Brian Gerst
References: <20150723212042.GN25159@twins.programming.kicks-ass.net>

On Thu, Jul 23, 2015 at 2:54 PM, Linus Torvalds wrote:
> On Thu, Jul 23, 2015 at 2:45 PM, Andy Lutomirski wrote:
>>
>> Or we just re-enable them on the way out of NMI (i.e. the very last
>> thing we do in the NMI handler). I don't want to break regular
>> userspace gdb when perf is running.
>
> I'd really prefer it if we don't touch NMI code in those kinds of
> ways. The NMI code is fragile as hell. All the problems we have with
> it is exactly due to "where is the boundary" issues.
>
> That's why I *don't* want NMI code to do magic crap. Anything that
> says "disable this during this magic window" is broken. The problems
> we've had are exactly about atomicity of the entry/exit conditions,
> and there is no really good way to get them right.
>
> I'd be much happier with a _TIF_USER_WORK_MASK approach exactly
> because it's so *obvious* that it's not a boundary condition.
>
> I dislike the "disable and re-enable dr7 in the NMI handler" exactly
> because it smells like "we can only handle faults in _this_ region".
> It may be true, but it's also what I want us to get away from.
> I'd much rather have the "big picture" be that we can take faults
> anywhere at all (*), and that none of the core code really cares.
> Then we "fix up" user space.

OK, new proposal:

In do_debug, if we trip an instruction breakpoint while
!user_mode(regs) && ((regs->flags & X86_EFLAGS_IF) == 0), then disarm
*that breakpoint*.

Why? It only looks at hardware state (dr6 and dr7), and it can't
break gdb, because gdb can't set a breakpoint that will cause this
problem.

All the other variants of this either need cached state or break gdb
watchpoints on stack variables with perf running.

--Andy

>
> Linus
>
> (*) And yes, sysenter and not having a stack at all is very special,
> and I think we will *always* have to have that magical special case of
> the first few instructions there. But that's a separate hardware
> limitation we can't get around.

-- 
Andy Lutomirski
AMA Capital Management, LLC
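
For illustration only, the check proposed above could be sketched in user-space C roughly as follows. This is not kernel code and not a patch from this thread. The names fake_regs, fake_user_mode, is_armed_insn_bp and disarm_tripped_insn_bps are hypothetical; the bit positions assume the usual x86 layout (DR6 bits 0-3 report which slot fired, DR7 bits 0-7 hold the per-slot L/G enables, DR7 bits 16 and up hold the per-slot R/W type, with 00 meaning instruction execution). The sketch takes dr6 and dr7 as plain values and returns the dr7 that would be written back.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_IF 0x200UL	/* interrupt-enable flag, bit 9 of EFLAGS */

/* Hypothetical, trimmed-down stand-in for the kernel's struct pt_regs. */
struct fake_regs {
	unsigned long cs;
	unsigned long flags;
};

/* Assumption: a trap came from user mode when the saved CS has RPL 3. */
static bool fake_user_mode(const struct fake_regs *regs)
{
	return (regs->cs & 3) == 3;
}

/* True if the slot is enabled (L or G bit) and its R/W type is 00 (execute). */
static bool is_armed_insn_bp(unsigned long dr7, int slot)
{
	unsigned long enable = (dr7 >> (slot * 2)) & 0x3;
	unsigned long rw = (dr7 >> (16 + slot * 4)) & 0x3;

	return enable != 0 && rw == 0;
}

/*
 * Return the dr7 value that should be in effect after the debug trap.
 * Only instruction breakpoints that fired in kernel mode with interrupts
 * off are disarmed; data watchpoints and user-mode hits are left alone.
 */
unsigned long disarm_tripped_insn_bps(const struct fake_regs *regs,
				      unsigned long dr6, unsigned long dr7)
{
	if (fake_user_mode(regs) || (regs->flags & X86_EFLAGS_IF))
		return dr7;

	for (int slot = 0; slot < 4; slot++) {
		bool tripped = (dr6 >> slot) & 1;

		if (tripped && is_armed_insn_bp(dr7, slot))
			dr7 &= ~(0x3UL << (slot * 2));	/* clear L and G */
	}
	return dr7;
}
```

The decision uses only the saved registers plus dr6 and dr7, which matches the "only looks at hardware state" point above. A real change would also have to write the result back to the hardware register and deal with whatever software bookkeeping exists for breakpoints, which this sketch ignores.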