Date: Thu, 17 Mar 2016 00:10:42 +0100
From: Peter Zijlstra
To: Vince Weaver
Cc: mingo@kernel.org, alexander.shishkin@linux.intel.com, eranian@google.com,
	linux-kernel@vger.kernel.org, dvyukov@google.com, andi@firstfloor.org,
	jolsa@redhat.com, panand@redhat.com, sasha.levin@oracle.com,
	oleg@redhat.com, Borislav Petkov
Subject: Re: [PATCH 00/12] perf: more fixes
Message-ID: <20160316231042.GT6375@twins.programming.kicks-ass.net>
References: <20160224174539.570749654@infradead.org>
	<20160310143924.GR6356@twins.programming.kicks-ass.net>
	<20160315153830.GA6356@twins.programming.kicks-ass.net>
	<20160316225933.GS6375@twins.programming.kicks-ass.net>
In-Reply-To: <20160316225933.GS6375@twins.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 16, 2016 at 11:59:33PM +0100, Peter Zijlstra wrote:
> Subject: perf, ibs: Fix race with IBS_STARTING state
> From: Peter Zijlstra
> Date: Wed Mar 16 23:55:21 CET 2016
>
> While tracing the IBS bits I saw the NMI hitting between clearing
> IBS_STARTING and the actual MSR writes to disable the counter.
>
> Since IBS_STARTING was cleared, the handler assumed these were spurious
> NMIs and because STOPPING wasn't set yet either, insta-triggered an
> "Unknown NMI".
>
> Cure this by clearing IBS_STARTING after disabling the hardware.
>
> Signed-off-by: Peter Zijlstra (Intel)
> ---
>  arch/x86/events/amd/ibs.c |   32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
>
> --- a/arch/x86/events/amd/ibs.c
> +++ b/arch/x86/events/amd/ibs.c
> @@ -376,7 +376,13 @@ static void perf_ibs_start(struct perf_e
>  	hwc->state = 0;
>
>  	perf_ibs_set_period(perf_ibs, hwc, &period);
> +	/*
> +	 * Set STARTED before enabling the hardware, such that
> +	 * a subsequent NMI must observe it. Then clear STOPPING
> +	 * such that we don't consume NMIs by accident.
> +	 */
>  	set_bit(IBS_STARTED, pcpu->state);
> +	clear_bit(IBS_STOPPING, pcpu->state);
>  	perf_ibs_enable_event(perf_ibs, hwc, period >> 4);

Also, all those atomic ops are probably entirely overkill and we could
use the non-atomic ops. This is all strictly cpu local.

But I didn't want to change too much at once, esp. while there's still
problems.
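
For anyone following along, the ordering problem described in the quoted
changelog can be sketched as a small user-space model. This is only an
illustration under stated assumptions, not the kernel code: struct sim_cpu,
the sim_* helpers and the two stop_*_with_nmi() functions are hypothetical
names, the "NMI" is just a function call placed inside the window, and the
bit helpers are plain non-atomic operations (fine here because the model is
single-threaded). Only the IBS_STARTED and IBS_STOPPING names and the
handler's decision order come from the mail above.

```c
#include <assert.h>
#include <stdbool.h>

/* State bits named in the patch; the numeric values are assumed. */
enum { IBS_STARTED = 0, IBS_STOPPING = 1, IBS_MAX_STATES = 2 };

/* Hypothetical per-CPU model: state bits plus a "hardware enabled" flag. */
struct sim_cpu {
	unsigned long state;
	bool hw_enabled;
};

/* Plain (non-atomic) bit helpers; enough for a single-threaded model. */
static inline void sim_set_bit(int nr, unsigned long *addr)
{
	assert(nr >= 0 && nr < IBS_MAX_STATES);
	*addr |= 1UL << nr;
}

static inline void sim_clear_bit(int nr, unsigned long *addr)
{
	assert(nr >= 0 && nr < IBS_MAX_STATES);
	*addr &= ~(1UL << nr);
}

static inline bool sim_test_bit(int nr, const unsigned long *addr)
{
	assert(nr >= 0 && nr < IBS_MAX_STATES);
	return (*addr >> nr) & 1UL;
}

enum nmi_result { NMI_SAMPLE, NMI_CONSUMED, NMI_UNKNOWN };

/* Handler model: STARTED means a real sample, STOPPING eats one late NMI. */
static enum nmi_result sim_handle_nmi(struct sim_cpu *cpu)
{
	if (sim_test_bit(IBS_STARTED, &cpu->state))
		return NMI_SAMPLE;
	if (sim_test_bit(IBS_STOPPING, &cpu->state)) {
		sim_clear_bit(IBS_STOPPING, &cpu->state);
		return NMI_CONSUMED;
	}
	return NMI_UNKNOWN;	/* what the kernel reports as "Unknown NMI" */
}

/* Start path as in the quoted hunk: STARTED, then !STOPPING, then enable. */
static void sim_start(struct sim_cpu *cpu)
{
	sim_set_bit(IBS_STARTED, &cpu->state);
	sim_clear_bit(IBS_STOPPING, &cpu->state);
	cpu->hw_enabled = true;
}

/* Racy ordering: STARTED is cleared while the hardware can still fire. */
static enum nmi_result stop_old_order_with_nmi(struct sim_cpu *cpu)
{
	enum nmi_result r;

	sim_clear_bit(IBS_STARTED, &cpu->state);
	r = sim_handle_nmi(cpu);	/* NMI lands inside the window */
	sim_set_bit(IBS_STOPPING, &cpu->state);
	cpu->hw_enabled = false;
	return r;
}

/* Fixed ordering: STARTED is cleared only after the hardware is off. */
static enum nmi_result stop_new_order_with_nmi(struct sim_cpu *cpu)
{
	enum nmi_result r;

	sim_set_bit(IBS_STOPPING, &cpu->state);
	r = sim_handle_nmi(cpu);	/* same NMI, STARTED still visible */
	cpu->hw_enabled = false;
	sim_clear_bit(IBS_STARTED, &cpu->state);
	return r;
}
```

In this model, calling sim_start() and then stop_old_order_with_nmi()
yields NMI_UNKNOWN, while stop_new_order_with_nmi() yields NMI_SAMPLE, and
one further late NMI after the stop is consumed via STOPPING. The real
stop path does more than this (MSR reads, period updates), so treat the
model as a picture of the bit ordering only.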