Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753813AbbLKIFd (ORCPT ); Fri, 11 Dec 2015 03:05:33 -0500 Received: from mail-ig0-f193.google.com ([209.85.213.193]:34513 "EHLO mail-ig0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753157AbbLKIFb (ORCPT ); Fri, 11 Dec 2015 03:05:31 -0500 MIME-Version: 1.0 In-Reply-To: References: Date: Fri, 11 Dec 2015 01:05:31 -0700 Message-ID: Subject: Re: [PATCH v2 1/1] Fix int1 recursion when no perf_bp_event is registeredy From: Jeff Merkey To: Thomas Gleixner Cc: Andy Lutomirski , LKML , Ingo Molnar , "H. Peter Anvin" , X86 ML , Peter Zijlstra , Andy Lutomirski , Masami Hiramatsu , Steven Rostedt , Borislav Petkov , Jiri Olsa Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7133 Lines: 211 I have completed testing of the following patch, added the fix and log a message when the error occurs. I also reviewed the code paths that touch this section of code. The lazy debug register behaviors pass state back and forth with this subsystem through the thread.debugred6 virtualized register value to determine whether or not another notify_die handler has been registered. The case that causes the lockup is when NOTIFY_STOP is set which halts calls to other handlers and also bypasses the checks at the bottom of the do_debug() function in traps.c and the bp event handler is NULL. In this case, the int1 exception has no code path to set the resume flag absent this patch. There are some complex behaviors in this code path that require careful review. I used the following code to test this change to the kernel compiled as a module. It's a pretty robust test since it sets breakpoints at interrupt contexts and process context. Unloading the module clears the breakpoints. I can send this test module as a separate patch if requested to verify. Test Module to test this patch: #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "linux/kallsyms.h" #define BREAK_EXECUTE 0 #define BREAK_WRITE 1 #define BREAK_IOPORT 2 #define BREAK_READWRITE 3 #define ONE_BYTE_FIELD 0 #define TWO_BYTE_FIELD 1 #define EIGHT_BYTE_FIELD 2 #define FOUR_BYTE_FIELD 3 /* DR7 Register */ #define L0_BIT 0x00000001 #define G0_BIT 0x00000002 #define L1_BIT 0x00000004 #define G1_BIT 0x00000008 #define L2_BIT 0x00000010 #define G2_BIT 0x00000020 #define L3_BIT 0x00000040 #define G3_BIT 0x00000080 #define LEXACT 0x00000100 #define GEXACT 0x00000200 #define GDETECT 0x00002000 #define DR7DEF 0x00000400 static int __init init_leaf(void) { unsigned long dr7 = 0; printk("0: %lX\n", kallsyms_lookup_name("reschedule_interrupt")); printk("1: %lX\n", kallsyms_lookup_name("apic_timer_interrupt")); printk("2: %lX\n", kallsyms_lookup_name("schedule")); printk("3: %lX\n", kallsyms_lookup_name("schedule")); set_debugreg(kallsyms_lookup_name("reschedule_interrupt"), 0); set_debugreg(kallsyms_lookup_name("apic_timer_interrupt"), 1); set_debugreg(kallsyms_lookup_name("schedule"), 2); set_debugreg(kallsyms_lookup_name("schedule"), 3); dr7 &= 0xFFF0FFFF; dr7 |= G0_BIT; dr7 |= ((BREAK_EXECUTE << ((0 * 4) + 16)) | (ONE_BYTE_FIELD << ((0 * 4) + 18))); dr7 &= 0xFF0FFFFF; dr7 |= G1_BIT; dr7 |= ((BREAK_EXECUTE << ((1 * 4) + 16)) | (ONE_BYTE_FIELD << ((1 * 4) + 18))); dr7 &= 0xF0FFFFFF; dr7 |= G2_BIT; dr7 |= ((BREAK_EXECUTE << ((2 * 4) + 16)) | (ONE_BYTE_FIELD << ((2 * 4) + 18))); dr7 &= 0x0FFFFFFF; dr7 |= G3_BIT; dr7 |= ((BREAK_READWRITE << ((3 * 4) + 16)) | (ONE_BYTE_FIELD << ((3 * 4) + 18))); set_debugreg(dr7, 7); return 0; } static void __exit cleanup_leaf(void) { set_debugreg(0L, 7); return; } module_init(init_leaf); module_exit(cleanup_leaf); MODULE_LICENSE("GPL"); The log output when the breakpoints are triggered on a 4 core system is as expected. I did not put code in that will clear the breakpoints automatically when this condition is detected. This is an item to discuss. [ 852.228301] hw_breakpoint_handler: 8816 callbacks suppressed [ 852.228314] INFO: spurious INT1 exception dr6: 0x1 dr7: 0x300004AA [ 852.228336] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA [ 852.228366] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA [ 852.228883] INFO: spurious INT1 exception dr6: 0x1 dr7: 0x300004AA [ 852.228887] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA [ 852.228973] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA [ 852.229250] INFO: spurious INT1 exception dr6: 0x2 dr7: 0x300004AA [ 852.229274] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA [ 852.229546] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA [ 852.230260] INFO: spurious INT1 exception dr6: 0x2 dr7: 0x300004AA [ 857.285756] hw_breakpoint_handler: 3381 callbacks suppressed [ 857.285764] INFO: spurious INT1 exception dr6: 0x1 dr7: 0x300004AA [ 857.285780] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA Open Items: Should we also disable these breakpoints by changing DR7 if they are spurious or just throw up a lot of messages? Jeff Signed-off-by: Jeff V. Merkey diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c index 50a3fad..1613a61 100644 --- a/arch/x86/kernel/hw_breakpoint.c +++ b/arch/x86/kernel/hw_breakpoint.c @@ -444,7 +444,7 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore); static int hw_breakpoint_handler(struct die_args *args) { int i, cpu, rc = NOTIFY_STOP; - struct perf_event *bp; + struct perf_event *bp = NULL; unsigned long dr7, dr6; unsigned long *dr6_p; @@ -475,6 +475,13 @@ static int hw_breakpoint_handler(struct die_args *args) for (i = 0; i < HBP_NUM; ++i) { if (likely(!(dr6 & (DR_TRAP0 << i)))) continue; + /* + * check if we got an execute breakpoint + * from the dr7 register. if we did, set + * the resume flag to avoid int1 recursion. + */ + if ((dr7 & (3 << ((i * 4) + 16))) == 0) + args->regs->flags |= X86_EFLAGS_RF; /* * The counter may be concurrently released but that can only @@ -503,7 +510,9 @@ static int hw_breakpoint_handler(struct die_args *args) /* * Set up resume flag to avoid breakpoint recursion when - * returning back to origin. + * returning back to origin. Perform the check + * twice in case the event handler altered the + * system flags. */ if (bp->hw.info.type == X86_BREAKPOINT_EXECUTE) args->regs->flags |= X86_EFLAGS_RF; @@ -519,6 +528,18 @@ static int hw_breakpoint_handler(struct die_args *args) (dr6 & (~DR_TRAP_BITS))) rc = NOTIFY_DONE; + /* + * if we are about to signal to + * do_debug() to stop further processing + * and we have not ascertained the source + * of the breakpoint, log it as spurious. + */ + if (rc == NOTIFY_STOP && !bp) { + printk_ratelimited(KERN_INFO + "INFO: spurious INT1 exception dr6: 0x%lX dr7: 0x%lX\n", + dr6, dr7); + } + set_debugreg(dr7, 7); put_cpu(); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/