Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933274AbcKHMUl (ORCPT ); Tue, 8 Nov 2016 07:20:41 -0500 Received: from bombadil.infradead.org ([198.137.202.9]:40318 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932398AbcKHMUj (ORCPT ); Tue, 8 Nov 2016 07:20:39 -0500 Date: Tue, 8 Nov 2016 13:20:39 +0100 From: Peter Zijlstra To: Jiri Olsa Cc: Vince Weaver , Robert Richter , Yan Zheng , lkml , Ingo Molnar , Andi Kleen Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug Message-ID: <20161108122039.GP3142@twins.programming.kicks-ass.net> References: <1478015068-14052-1-git-send-email-jolsa@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1478015068-14052-1-git-send-email-jolsa@kernel.org> User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2863 Lines: 86 On Tue, Nov 01, 2016 at 04:44:28PM +0100, Jiri Olsa wrote: > My fuzzer testing hits following warning in the counter scheduling code: > > WARNING: CPU: 0 PID: 0 at arch/x86/events/core.c:718 perf_assign_events+0x2ae/0x2c0 > Call Trace: > > dump_stack+0x68/0x93 > __warn+0xcb/0xf0 > warn_slowpath_null+0x1d/0x20 > perf_assign_events+0x2ae/0x2c0 > uncore_assign_events+0x1a7/0x250 [intel_uncore] > uncore_pmu_event_add+0x7a/0x3c0 [intel_uncore] > event_sched_in.isra.104+0xf6/0x2e0 > group_sched_in+0x6e/0x190 > ... > > The reason is that the counter scheduling code assumes > overlap constraints with mask weight < SCHED_STATES_MAX. > > This assumption is broken with uncore cbox constraint > added for snbep in: > 3b19e4c98c03 perf/x86: Fix event constraint for SandyBridge-EP C-Box 3b19e4c98c03 ("perf/x86: Fix event constraint for SandyBridge-EP C-Box") Is the right form. > It's also easily triggered by running following perf command > on snbep box: > # perf stat -e 'uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/' -a > > Fixing this by increasing the SCHED_STATES_MAX to 3 and adding build > check for EVENT_CONSTRAINT_OVERLAP macro. > -#define SCHED_STATES_MAX 2 > +#define SCHED_STATES_MAX 3 Us having to increase this is ff'ing sad :/ That's seriously challenged hardware :/ > + > +/* Check we dont overlap beyond the states max. */ > +#define OVERLAP_CHECK(n) (!!sizeof(char[1 - 2*!!(HWEIGHT(n) > SCHED_STATES_MAX)])) > +#define OVERLAP_HWEIGHT(n) (OVERLAP_CHECK(n)*HWEIGHT(n)) I'm not sure I get how this is correct at all. You cannot tell by a single mask what the overlap is. You need all the masks. The point is that that PMU has constraints like: 0x01 - 0001 0x03 - 0011 0x0e - 1110 0x0c - 1100 Which gets us a total of 4 overlapping counter masks, and that would indeed lead to max 3 retries I think. Now, I would much rather solve this by changing the constraint like the below, that yields: 0x01 - 0001 0x03 - 0011 0x0c - 1100 Which is two distinct groups, only one of which has overlap. And the one with overlap only has 2 overlapping masks, giving a max reties of 1. diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index 272427700d48..71bc348736bd 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = { UNCORE_EVENT_CONSTRAINT(0x1c, 0xc), UNCORE_EVENT_CONSTRAINT(0x1d, 0xc), UNCORE_EVENT_CONSTRAINT(0x1e, 0xc), - EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff), + UNCORE_EVENT_CONSTRAINT(0x1f, 0xc); /* should be 0x0e but that gives scheduling pain */ UNCORE_EVENT_CONSTRAINT(0x21, 0x3), UNCORE_EVENT_CONSTRAINT(0x23, 0x3), UNCORE_EVENT_CONSTRAINT(0x31, 0x3),