Date: Wed, 16 Dec 2015 17:31:06 -0700
Subject: Re: 4.4-rc5 Setting hardware breakpoint in int_ret_from_sys_call causes triple fault/reboot
From: Jeff Merkey
To: Andy Lutomirski
Cc: Peter Zijlstra, "H. Peter Anvin", X86 ML, Thomas Gleixner, LKML, Ingo Molnar
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/16/15, Andy Lutomirski wrote:
> On Dec 16, 2015 3:12 PM, "Jeff Merkey" wrote:
>>
>> Setting a hardware breakpoint at the
>>
>> rex64 sysret
>>
>> instruction at the end of int_ret_from_sys_call causes the system to
>> triple fault and reboot when the breakpoint is triggered. Appears to
>> be related to the same problem as the lockup.
>>
>> This function can be stepped over and traced through with the TRAP
>> FLAG set so long as a hardware breakpoint is set somewhere in the
>> function. Otherwise upon exit the system hard hangs. If you break
>> exactly on that instruction -- reboot. If you break a few
>> instructions before it and single step through the call it works. If
>> you step through the call with no breakpoint the system hard hangs.
>> Same behavior as when you try to step from inside an nmi handler.
>> Looks related.
>
> You're probably encountering the user mode RSP when SYSRET happens.
>
> --Andy
>

Hi Andy,

Could be, but I am getting a double fault message with an error code
of 0 that then scrolls off the screen when the triple fault hits.
It flashes by too quickly to read the function address -- wish I had a
logic analyzer with an inverse assembler -- I would already be there.
A user-mode RSP would, I assume, clear the TRAP flag, and that does not
explain why it works if I set a breakpoint right above the instruction
and then step over it, which I can do without the triple fault.

Easy to reproduce: download the mdb debugger for 4.3.3 and apply it to
4.4-rc5, modprobe mdb, echo a > /proc/sysrq-trigger, then
u int_ret_from_sys_call and scroll until you get to the swapgs followed
by rex64 sysret. Set a hardware breakpoint at the swapgs address, i.e.
b ffffffff81673ae1 (or whatever address the swapgs instruction is at),
then step through with t a few times (it should just return after
rex64 sysret, since it returns to user space). Then set a breakpoint at
the rex64 sysret instruction, b
, let it break at the instruction, then hit g for go and watch the
fireworks -- it will try to print a double fault message then reboot.
I handle the whole user RSP thing, I just return if I see regs set to
user space. This looks like some sort of problem in the exception
handlers.

Jeff
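
For reference, the reproduction steps described in this message can be laid out as a session sketch. This is only a restatement under assumptions: the commands and their order are taken from the message as written, the comments on what each mdb command does are inferred from how the message uses them, the swapgs address will differ per build, and the rex64 sysret address is a placeholder because it is not given here.

```text
# shell, as root, on 4.4-rc5 with the mdb patch (for 4.3.3) applied
modprobe mdb
echo a > /proc/sysrq-trigger     # assumed to drop into the debugger

# inside mdb
u int_ret_from_sys_call          # list the function; scroll to swapgs, then rex64 sysret
b ffffffff81673ae1               # hardware breakpoint at swapgs (example address from the message)
t                                # repeat a few times; reported to return to user space normally
b <address of rex64 sysret>      # placeholder, address not given in the message
g                                # reported result: double fault message, then triple fault/reboot
```

The two breakpoints are the point of the comparison: breaking on swapgs and single-stepping across rex64 sysret is reported to work, while breaking exactly on rex64 sysret is reported to reboot the machine.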