Date: Tue, 8 Nov 2016 19:27:39 +0100
From: Peter Zijlstra
To: "Liang, Kan"
Cc: Andi Kleen, Jiri Olsa, Vince Weaver, Robert Richter, lkml, Ingo Molnar
Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug
Message-ID: <20161108182739.GO3117@twins.programming.kicks-ass.net>

On Tue, Nov 08, 2016 at 05:25:34PM +0000, Liang, Kan wrote:
> > I think all the 0x3 mask need the overlap flag set, since they clearly overlap
> > with the 0x1 masks. That would improve the scheduling.
> >
>
> How much the overlap hint can improve the scheduling?
> Because there is not only snbep_uncore_cbox, but also other uncore events
> which have overlapping masks.

Hurm, not much. We're saved by the fact that we schedule from wmin to
wmax, which means that we first place the 0x01 events, and then try and
fit the 0x03 events on top. That should already be good.

/me ponders more..

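To illustrate why scheduling from lowest to highest weight already helps
for the 0x01/0x03 case, here is a minimal sketch. It is a toy model, not
the kernel's scheduler: greedy_assign() and weight() are hypothetical
helpers, and "take the lowest free allowed counter" is an assumed pick
policy.

```python
def weight(mask):
    """Number of counters an event may use (popcount of its mask)."""
    return bin(mask).count("1")


def greedy_assign(masks, order):
    """Place events in the given order, each on its lowest free allowed
    counter. Returns {event index: counter}, or None if an event cannot
    be placed. No rewinding is attempted."""
    used = 0
    assignment = {}
    for idx in order:
        free = masks[idx] & ~used
        if free == 0:
            return None
        lowest = free & -free          # isolate the lowest set bit
        used |= lowest
        assignment[idx] = lowest.bit_length() - 1
    return assignment


# One 0x03 event listed before one 0x01 event.
events = [0x03, 0x01]

# List order: 0x03 grabs counter 0, leaving nothing for 0x01.
in_list_order = greedy_assign(events, [0, 1])

# Weight order: 0x01 goes first, 0x03 then fits on counter 1.
weight_order = sorted(range(len(events)), key=lambda i: weight(events[i]))
by_weight = greedy_assign(events, weight_order)
```

In this model in_list_order fails while by_weight places both events, so
the weight ordering alone covers the 0x01/0x03 overlap.
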
The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
the counter mask of such an event is not a subset of any other counter
mask of a constraint with an equal or higher weight".

Esp. that latter part is of interest here I think: our overlapping mask
is 0x0e, which has 3 bits set and is the highest weight mask on the PMU,
therefore it will be placed last. Can we still create a scenario where
we would need to rewind that?

The scenario for AMD Fam15h is that we have masks like:

	0x3F -- 111111
	0x38 -- 111000
	0x07 -- 000111

	0x09 -- 001001

And we mark 0x09 as overlapping, because it is not a direct subset of
0x38 or 0x07 and has less weight than either of those. This means we'll
first try and place the 0x09 event, then try and place 0x38/0x07 events.
Now imagine we have:

	3 * 0x07 + 0x09

and the initial pick for the 0x09 event is counter 0, then we'll fail to
place all 0x07 events. So we'll pop back, try counter 3 for the 0x09
event (the only other bit set in 0x09), and then re-try all 0x07 events,
which will now work.

But given that, in the uncore case, the overlapping event is the
heaviest mask, I don't think this can happen.

Or did I overlook something.... takes a bit to page all this back in.
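
For reference, the Fam15h scenario above (3 * 0x07 + 0x09) can be
replayed with a small toy model. This is only a sketch under
assumptions: greedy() and backtracking() are hypothetical names, the
pick policy is "lowest free counter first", and backtracking() rewinds
at every event rather than only at events marked as overlapping, which
is a simplification of what the scheduler does.

```python
def weight(mask):
    """Number of counters an event may use (popcount of its mask)."""
    return bin(mask).count("1")


def counters(mask):
    """Yield the counter indices allowed by mask, lowest first."""
    idx = 0
    while mask:
        if mask & 1:
            yield idx
        mask >>= 1
        idx += 1


def greedy(masks):
    """Weight-ordered placement without rewinding; None on failure."""
    used, out = 0, {}
    for i in sorted(range(len(masks)), key=lambda i: weight(masks[i])):
        choice = next((c for c in counters(masks[i])
                       if not used & (1 << c)), None)
        if choice is None:
            return None
        used |= 1 << choice
        out[i] = choice
    return out


def backtracking(masks):
    """Weight-ordered placement that pops back and tries the next
    allowed counter whenever a later event cannot be placed."""
    order = sorted(range(len(masks)), key=lambda i: weight(masks[i]))
    out = {}

    def place(pos, used):
        if pos == len(order):
            return True
        i = order[pos]
        for c in counters(masks[i]):
            if used & (1 << c):
                continue
            out[i] = c
            if place(pos + 1, used | (1 << c)):
                return True
            del out[i]                 # rewind this pick
        return False

    return out if place(0, 0) else None


fam15h = [0x07, 0x07, 0x07, 0x09]
greedy_result = greedy(fam15h)         # 0x09 takes counter 0, a 0x07 starves
rewind_result = backtracking(fam15h)   # 0x09 moves to counter 3
```

In this model the greedy pass fails, while the rewinding pass ends up
with the 0x09 event on counter 3 and the three 0x07 events on counters
0-2, which matches the pop-back described above.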