Date: Tue, 18 Aug 2015 10:52:11 +0200
From: Borislav Petkov
To: Andy Lutomirski
Cc: Jiri Olsa, "linux-kernel@vger.kernel.org", X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, "H. Peter Anvin", Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek
Subject: Re: [BUG/RFC] perf test fails on AMD CPUs
Message-ID: <20150818085211.GA28738@nazgul.tnic>
References: <20150816222956.GA14290@krava.brq.redhat.com>
	<20150817043603.GB9387@nazgul.tnic>

On Mon, Aug 17, 2015 at 09:06:59AM -0700, Andy Lutomirski wrote:
> >> expected course of actions is:
> >>   1) CPU hits 'test_function'
> >>   2) DB exception is triggered, with RFLAGS.RF=0
> >>   3) DB exception handler sets regs->RFLAGS.RF=1 and perf handler
> >>      triggers irq_work pending work
> >>   4) DB exception executes iretd
> >>   5) irq_work interrupt is triggered, with RFLAGS.RF=1
> >>   6) irq_work interrupt calls kill_fasync with SIGIO signal
> >>   7) irq_work interrupt on return to userspace calls prepare_exit_to_usermode
> >>      which actually delivers the SIGIO signal
> >>   8) sigreturn syscall prepare registers to return to the
> >>      instruction from step 1) and sets RFLAGS.RF to the its original
> >>      value from step 5) (RFLAGS.RF=1)
> >>   9) CPU hits 'test_function' and DB exception is NOT triggered
> >>      due to RFLAGS.RF=1
> >>
> >> this is how I see it works on Intel
> >>
> >> But AMD
> >> gives me RFLAGS.RF=0 on step 5, which makes the step 9 to
> >> trigger the DB exception once again and makes the test fail.

Waaaiit a minute! APM says the #DB exception handler must set RF in the
EFLAGS image on the exception stack (or wherever it is running) so
that the breakpoint doesn't trigger again.

Now: do_debug() *doesn't* do that but hw_breakpoint_handler() does. So
do we call hw_breakpoint_handler() in those steps above? Because if we
don't, that could explain the issue...

> > Adding Andy, he might have an idea. Leaving in the rest for reference.
>
> Gee thanks :-p

For what, adding you to CC or leaving in the rest? :-P

> Jiri, did you instrument the code and observe do_IRQ sees RF clear in
> its pt_regs? Also, it might be worth checking that regs->ip in the
> irq_work matches regs->ip.

Hohumm.

> It's *possible* that I messed up and broke RF restore with
> opportunistic sysret, but the code looks correct:
>
> testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
> jnz opportunistic_sysret_failed

Yeah, I was looking at that too.

> >> An IRET that sets the RF bit.
> >> JMP, CALL, or INTn through a task gate.
> >> In both of the above cases, RF is not cleared to 0 until the next instruction successfully executes.
> >> When an exception occurs (or when a string instruction is interrupted), the processor normally sets
> >> RF=1 in the RFLAGS image saved on the interrupt stack. However, when a #DB exception occurs as a
> >> result of an instruction breakpoint, the processor clears the RF bit to 0 in the interrupt-stack RFLAGS
> >> image.
>
> That's a little weird, I think. Shouldn't RF be zero on #DB due to a
> *watchpoint* so that a watchpoint followed immediately by a breakpoint
> works?

What is a watchpoint? R/Wn bit = 1?

Btw, that sounds weird - why would the #DB exception clear RF just so
that the #DB handler sets it right after... I'm probably missing
something obvious.
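
For illustration only, the flags test in the quoted opportunistic-sysret snippet can be written out in C. This is a rough sketch, not kernel code: `flags_allow_sysret()` is a hypothetical helper name, and it covers only the RF/TF part of the decision, nothing else the real exit path checks. The bit positions are the architectural ones (TF is bit 8, RF is bit 16).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural RFLAGS bit positions. */
#define X86_EFLAGS_TF (UINT64_C(1) << 8)   /* trap flag */
#define X86_EFLAGS_RF (UINT64_C(1) << 16)  /* resume flag */

/*
 * Hypothetical helper, named for this sketch only. It mirrors
 *
 *     testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
 *     jnz   opportunistic_sysret_failed
 *
 * i.e. the fast return is only attempted when neither RF nor TF is
 * set in the flags value that is about to be restored.
 */
static bool flags_allow_sysret(uint64_t saved_rflags)
{
    return (saved_rflags & (X86_EFLAGS_RF | X86_EFLAGS_TF)) == 0;
}
```

With a saved flags value that has RF set, the helper returns false, which corresponds to taking the `opportunistic_sysret_failed` branch, so RF is not dropped on that path.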
> >> • For other cases, the value pushed for RF is the value that was in EFLAG.RF at the time the event handler was
> >> called. This includes:
> >> — Debug exceptions generated in response to instruction breakpoints
> >> — Hardware-generated interrupts arriving between instructions (including those arriving after the last
> >> iteration of a repeated string instruction)
>
> This appears to be why it works on Intel. Does AMD not do that? We
> could probably work around this in software (by not using irq work for
> this), but yuck.

See above.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
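
To make the nine steps from the start of the thread easier to reason about, here is a toy model that tracks nothing but the RF bit. It is a sketch under stated assumptions, not kernel code: `struct cpu` and `count_db_exceptions()` are made-up names, the irq_work interrupt is assumed to fire exactly once (right after the first #DB returns), and sigreturn is assumed to restore RF from the interrupt frame as described in step 8. The single parameter selects what the interrupt frame holds in step 5: the RF value the CPU had when the interrupt arrived, or RF=0 as Jiri reports for AMD.

```c
#include <stdbool.h>

/* Toy CPU state: only the live RF bit is modelled. */
struct cpu {
    bool rf;
};

/*
 * Count the #DB exceptions taken for one execution of the
 * breakpointed instruction.
 *
 * irq_pushes_live_rf:
 *   true  - the irq_work frame holds the RF value the CPU had when
 *           the interrupt arrived (steps 5-9 as listed in the thread)
 *   false - the irq_work frame holds RF=0 (the reported AMD result)
 */
static int count_db_exceptions(bool irq_pushes_live_rf)
{
    struct cpu cpu = { .rf = false };
    bool irq_work_pending = false;
    bool irq_work_done = false;   /* assumption: irq_work fires once */
    int db_count = 0;

    for (;;) {
        if (!cpu.rf) {
            /* Steps 1-2: breakpoint hit, #DB frame has RF=0. */
            bool frame_rf = false;
            db_count++;

            /* Step 3: handler sets RF in the saved image and the
             * perf handler queues irq_work. */
            frame_rf = true;
            if (!irq_work_done)
                irq_work_pending = true;

            /* Step 4: iret loads RF from the saved image. */
            cpu.rf = frame_rf;
        }

        if (irq_work_pending) {
            /* Step 5: interrupt arrives before the instruction runs. */
            bool frame_rf = irq_pushes_live_rf ? cpu.rf : false;
            irq_work_pending = false;
            irq_work_done = true;

            /* Steps 6-8: signal is delivered, sigreturn restores RF
             * from the interrupt frame. */
            cpu.rf = frame_rf;
            continue;   /* Step 9: retry the instruction. */
        }

        /* The instruction finally executes. */
        return db_count;
    }
}
```

In this model `count_db_exceptions(true)` yields one #DB and `count_db_exceptions(false)` yields two, which matches the symptom described: the breakpoint fires a second time when the interrupt frame carries RF=0. It says nothing about whether hw_breakpoint_handler() is actually reached; that still needs checking on real hardware.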