Date: Tue, 8 Nov 2016 17:57:49 +0100
From: Peter Zijlstra
To: "Liang, Kan"
Cc: Andi Kleen, Jiri Olsa, Vince Weaver, Robert Richter, lkml, Ingo Molnar
Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug
Message-ID: <20161108165749.GJ3117@twins.programming.kicks-ass.net>
References: <1478015068-14052-1-git-send-email-jolsa@kernel.org> <20161108122039.GP3142@twins.programming.kicks-ass.net> <20161108150949.GM26852@two.firstfloor.org> <37D7C6CF3E00A74B8858931C1DB2F07750C8D9D0@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F07750C8D9D0@SHSMSX103.ccr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 08, 2016 at 04:22:13PM +0000, Liang, Kan wrote:
> > > diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> > > index 272427700d48..71bc348736bd 100644
> > > --- a/arch/x86/events/intel/uncore_snbep.c
> > > +++ b/arch/x86/events/intel/uncore_snbep.c
> > > @@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = {
> > >  	UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
> > >  	UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
> > >  	UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
> > > -	EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
> > > +	UNCORE_EVENT_CONSTRAINT(0x1f, 0xc); /* should be 0x0e but that gives scheduling pain */
>
> I think the crash is caused by the overlap bit.
> Why not just revert the previous patch?
>
> If overlap bit is removed, the perf_sched_save_state will never be touched.
> Why we have to reduce a counter?

By simply removing the overlap bit you'll still get bad scheduling (we'll
just not crash).

I think all the 0x3 masks need the overlap flag set, since they clearly
overlap with the 0x1 masks. That would improve the scheduling.

But as Jiri noted, you cannot do 0x1 + 0x3 + 0xc + 0xe without also
raising the retry limit, because those are 4 overlapping masks; worst
case, you'll have to pop 3 attempts.

By reducing 0xe to 0xc you'll not have 4 overlapping masks anymore.

In any case, overlapping masks stink (because they make scheduling
O(n!)) and ideally hardware would not do this.
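
For illustration, here is a minimal standalone sketch of why overlapping
counter masks force the scheduler to back out of earlier choices. It is
not the kernel's scheduler: NUM_COUNTERS, schedule_greedy() and
schedule_backtrack() are hypothetical names, and the masks 0x9 and 0x7
are made up for the example rather than taken from the uncore tables.
Each mask has bit i set when the event may run on counter i.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_COUNTERS 4	/* hypothetical PMU with 4 counters */

/*
 * Greedy pass: give each event the lowest free counter its mask allows
 * and never revisit that choice. Events are expected in ascending
 * mask-weight order.
 */
bool schedule_greedy(const unsigned *masks, size_t n, int *assign)
{
	unsigned used = 0;

	for (size_t i = 0; i < n; i++) {
		int c;

		for (c = 0; c < NUM_COUNTERS; c++) {
			unsigned bit = 1u << c;

			if ((masks[i] & bit) && !(used & bit))
				break;
		}
		if (c == NUM_COUNTERS)
			return false;	/* no free counter left for event i */
		assign[i] = c;
		used |= 1u << c;
	}
	return true;
}

/*
 * Backtracking pass: try every allowed free counter for event i and
 * undo the choice if the remaining events cannot be placed. The number
 * of partial assignments explored grows factorially in the worst case.
 */
bool schedule_backtrack(const unsigned *masks, size_t n, size_t i,
			unsigned used, int *assign)
{
	if (i == n)
		return true;

	for (int c = 0; c < NUM_COUNTERS; c++) {
		unsigned bit = 1u << c;

		if (!(masks[i] & bit) || (used & bit))
			continue;
		assign[i] = c;
		if (schedule_backtrack(masks, n, i + 1, used | bit, assign))
			return true;
	}
	return false;
}
```

With masks { 0x9, 0x7, 0x7, 0x7 }, the greedy pass puts the 0x9 event on
counter 0 and then runs out of counters for the third 0x7 event. The
backtracking pass, called with i = 0 and used = 0, retries the 0x9 event
on counter 3 and places all four events.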