Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755671AbbHQQHV (ORCPT ); Mon, 17 Aug 2015 12:07:21 -0400
Received: from mail-oi0-f47.google.com ([209.85.218.47]:35972 "EHLO
	mail-oi0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754608AbbHQQHU convert rfc822-to-8bit (ORCPT );
	Mon, 17 Aug 2015 12:07:20 -0400
MIME-Version: 1.0
In-Reply-To: <20150817043603.GB9387@nazgul.tnic>
References: <20150816222956.GA14290@krava.brq.redhat.com>
	<20150817043603.GB9387@nazgul.tnic>
From: Andy Lutomirski
Date: Mon, 17 Aug 2015 09:06:59 -0700
Message-ID:
Subject: Re: [BUG/RFC] perf test fails on AMD CPUs
To: Borislav Petkov
Cc: Jiri Olsa, "linux-kernel@vger.kernel.org", X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, "H. Peter Anvin", Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4267
Lines: 105

On Sun, Aug 16, 2015 at 9:36 PM, Borislav Petkov wrote:
> On Mon, Aug 17, 2015 at 12:29:56AM +0200, Jiri Olsa wrote:
>> hi,
>> 'perf test 18' is failing on systems with AMD processor.
>
> Hmm, still using that b0rked test box? :-)
>
> Also, which kernel?
>
> There have been substantial changes to the entry code recently. Although
> I don't see anything being done differently on AMD there except
> X86_BUG_SYSRET_SS_ATTRS but that should be unrelated.
>
>> The only reason I could find is that AMD does not set 'resume flag'
>> in RFLAGS register the way the Intel CPU does.
>>
>> (simplified) test scenario:
>>
>>   - create breakpoint (on test_function) perf event with SIGIO signal
>>     to be delivered any time the breakpoint is hit
>>   - run test_function
>>
>>
>> expected course of actions is:
>>   1) CPU hits 'test_function'
>>   2) DB exception is triggered, with RFLAGS.RF=0
>>   3) DB exception handler sets regs->RFLAGS.RF=1 and perf handler
>>      triggers irq_work pending work
>>   4) DB exception executes iretd
>>   5) irq_work interrupt is triggered, with RFLAGS.RF=1
>>   6) irq_work interrupt calls kill_fasync with SIGIO signal
>>   7) irq_work interrupt on return to userspace calls prepare_exit_to_usermode
>>      which actually delivers the SIGIO signal
>>   8) sigreturn syscall prepare registers to return to the
>>      instruction from step 1) and sets RFLAGS.RF to its original
>>      value from step 5) (RFLAGS.RF=1)
>>   9) CPU hits 'test_function' and DB exception is NOT triggered
>>      due to RFLAGS.RF=1
>>
>> this is how I see it works on Intel
>>
>> But AMD gives me RFLAGS.RF=0 on step 5, which makes the step 9 to
>> trigger the DB exception once again and makes the test fail.
>
> Adding Andy, he might have an idea. Leaving in the rest for reference.

Gee thanks :-p

Jiri, did you instrument the code and observe do_IRQ sees RF clear in
its pt_regs?  Also, it might be worth checking that regs->ip in the
irq_work matches regs->ip.

It's *possible* that I messed up and broke RF restore with
opportunistic sysret, but the code looks correct:

	testq	$(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
	jnz	opportunistic_sysret_failed

>
>> I'm not sure this test ever worked on AMD CPUs, anyway is there
>> anything I'm missing or is this some AMD/Intel quirk?
>>
>> thanks,
>> jirka
>>
>>
>>
>> AMD description of RF flag (SDM 3.1.6):
>> =======================================
>> Resume Flag (RF) Bit. Bit 16. The RF bit allows an instruction to be restarted following an
>> instruction breakpoint resulting in a debug exception (#DB).
>> This bit prevents multiple debug exceptions from occurring on the same instruction.
>> The processor clears the RF bit after every instruction is successfully executed, except when the
>> instruction is:
>>   • An IRET that sets the RF bit.
>>   • JMP, CALL, or INTn through a task gate.
>> In both of the above cases, RF is not cleared to 0 until the next instruction successfully executes.
>> When an exception occurs (or when a string instruction is interrupted), the processor normally sets
>> RF=1 in the RFLAGS image saved on the interrupt stack. However, when a #DB exception occurs as a
>> result of an instruction breakpoint, the processor clears the RF bit to 0 in the interrupt-stack RFLAGS
>> image.

That's a little weird, I think.  Shouldn't RF be zero on #DB due to a
*watchpoint* so that a watchpoint followed immediately by a breakpoint
works?

>> • For other cases, the value pushed for RF is the value that was in EFLAG.RF at the time the event handler was
>>   called. This includes:
>>   - Debug exceptions generated in response to instruction breakpoints
>>   - Hardware-generated interrupts arriving between instructions (including those arriving after the last
>>     iteration of a repeated string instruction)

This appears to be why it works on Intel.  Does AMD not do that?

We could probably work around this in software (by not using irq work
for this), but yuck.

--Andy