Date: Fri, 15 Mar 2019 13:03:11 +0100
From: Peter Zijlstra
To: "Lendacky, Thomas"
Cc: "x86@kernel.org", "linux-kernel@vger.kernel.org",
	Arnaldo Carvalho de Melo, Alexander Shishkin, Ingo Molnar,
	Borislav Petkov, Namhyung Kim, Thomas Gleixner, Jiri Olsa
Subject: Re: [RFC PATCH 2/2] x86/perf/amd: Resolve NMI latency issues when multiple PMCs are active
Message-ID: <20190315120311.GX5996@hirez.programming.kicks-ass.net>
References: <155232291547.21417.2499429555505085131.stgit@tlendack-t1.amdoffice.net>
	<155232292961.21417.3665243457569518550.stgit@tlendack-t1.amdoffice.net>
In-Reply-To: <155232292961.21417.3665243457569518550.stgit@tlendack-t1.amdoffice.net>

On Mon, Mar 11, 2019 at 04:48:51PM +0000, Lendacky, Thomas wrote:
> @@ -467,6 +470,45 @@ static void amd_pmu_wait_on_overflow(int idx, u64 config)
>  	}
>  }
>
> +/*
> + * Because of NMI latency, if multiple PMC counters are active we need to take
> + * into account that multiple PMC overflows can generate multiple NMIs but be
> + * handled by a single invocation of the NMI handler (think PMC overflow while
> + * in the NMI handler). This could result in subsequent unknown NMI messages
> + * being issued.
> + *
> + * Attempt to mitigate this by using the number of active PMCs to determine
> + * whether to return NMI_HANDLED if the perf NMI handler did not handle/reset
> + * any PMCs. The per-CPU perf_nmi_counter variable is set to a minimum of one
> + * less than the number of active PMCs or 2. The value of 2 is used in case the
> + * NMI does not arrive at the APIC in time to be collapsed into an already
> + * pending NMI.

LAPIC I really do hope?!

> + */
> +static int amd_pmu_mitigate_nmi_latency(unsigned int active, int handled)
> +{
> +	/* If multiple counters are not active return original handled count */
> +	if (active <= 1)
> +		return handled;

Should we not reset perf_nmi_counter in this case?

> +
> +	/*
> +	 * If a counter was handled, record the number of possible remaining
> +	 * NMIs that can occur.
> +	 */
> +	if (handled) {
> +		this_cpu_write(perf_nmi_counter,
> +			       min_t(unsigned int, 2, active - 1));
> +
> +		return handled;
> +	}
> +
> +	if (!this_cpu_read(perf_nmi_counter))
> +		return NMI_DONE;
> +
> +	this_cpu_dec(perf_nmi_counter);
> +
> +	return NMI_HANDLED;
> +}
> +
>  static struct event_constraint *
>  amd_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>  			  struct perf_event *event)
> @@ -689,6 +731,7 @@ static __initconst const struct x86_pmu amd_pmu = {
>
>  	.amd_nb_constraints	= 1,
>  	.wait_on_overflow	= amd_pmu_wait_on_overflow,
> +	.mitigate_nmi_latency		= amd_pmu_mitigate_nmi_latency,
>  };

Again, you could just do amd_pmu_handle_irq() and avoid an extra callback.

Anyway, we already had code to deal with spurious NMIs from AMD; see
commit:

  63e6be6d98e1 ("perf, x86: Catch spurious interrupts after disabling counters")

And that looks to be doing something very much the same. Why then do you
still need this on top?
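
To make the perf_nmi_counter reset question above concrete, here is a
rough user-space model of the quoted function. It is only a sketch and
not kernel code: a plain global stands in for the per-CPU variable,
min_uint() stands in for min_t(), and the reset_when_idle flag and the
replay() helper are hypothetical names added purely for illustration.
Whether the stale count can actually bite on real hardware is not
something this model can show.

```c
/* Stand-ins for the kernel's NMI return codes. */
#define NMI_DONE	0
#define NMI_HANDLED	1

/* Assumption: a single CPU, so a plain global replaces the per-CPU variable. */
static unsigned int perf_nmi_counter;

/* Hypothetical helper standing in for min_t(unsigned int, a, b). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Copy of the quoted logic. reset_when_idle is a hypothetical switch that
 * clears the counter on the early-return path when at most one PMC is active.
 */
static int mitigate_nmi_latency(unsigned int active, int handled,
				int reset_when_idle)
{
	if (active <= 1) {
		if (reset_when_idle)
			perf_nmi_counter = 0;
		return handled;
	}

	/* A counter was handled: record how many further NMIs may follow. */
	if (handled) {
		perf_nmi_counter = min_uint(2, active - 1);
		return handled;
	}

	if (!perf_nmi_counter)
		return NMI_DONE;

	perf_nmi_counter--;
	return NMI_HANDLED;
}

/*
 * Replay three NMIs: an overflow with 3 active PMCs, an overflow with 1
 * active PMC, then an NMI with 3 active PMCs where nothing was handled.
 * Returns what the last call reports.
 */
static int replay(int reset_when_idle)
{
	perf_nmi_counter = 0;
	mitigate_nmi_latency(3, 1, reset_when_idle);
	mitigate_nmi_latency(1, 1, reset_when_idle);
	return mitigate_nmi_latency(3, 0, reset_when_idle);
}
```

In this model replay(0) returns NMI_HANDLED, because the count left over
from the first overflow survives the single-counter phase and claims the
later unhandled NMI, while replay(1) returns NMI_DONE.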