Date: Thu, 1 Aug 2019 23:34:23 +0200 (CEST)
From: Thomas Gleixner
To: "Lendacky, Thomas"
cc: Peter Zijlstra, "linux-kernel@vger.kernel.org", "x86@kernel.org",
    Ingo Molnar, Borislav Petkov, Arnaldo Carvalho de Melo,
    Alexander Shishkin, Namhyung Kim, Jiri Olsa, Jerry Hoemann
Subject: Re: [PATCH] perf/x86/amd: Change NMI latency mitigation to use a timestamp
References: <833ee307989ac6bfb45efe823c5eca4b2b80c7cf.1564685848.git.thomas.lendacky@amd.com>
    <20190801211613.GB3578@hirez.programming.kicks-ass.net>
X-Mailing-List:
linux-kernel@vger.kernel.org

On Thu, 1 Aug 2019, Lendacky, Thomas wrote:
> On 8/1/19 4:16 PM, Peter Zijlstra wrote:
> > On Thu, Aug 01, 2019 at 06:57:41PM +0000, Lendacky, Thomas wrote:
> >> From: Tom Lendacky
> >>
> >> It turns out that the NMI latency workaround from commit 6d3edaae16c6
> >> ("x86/perf/amd: Resolve NMI latency issues for active PMCs") ends up
> >> being too conservative and results in the perf NMI handler claiming NMIs
> >> too easily on AMD hardware when the NMI watchdog is active.
> >>
> >> This has an impact, for example, on the hpwdt (HPE watchdog timer) module.
> >> This module can produce an NMI that is used to reset the system. It
> >> registers an NMI handler for the NMI_UNKNOWN type and relies on the fact
> >> that nothing has claimed an NMI so that its handler will be invoked when
> >> the watchdog device produces an NMI. After the referenced commit, the
> >> hpwdt module is unable to process its generated NMI if the NMI watchdog is
> >> active, because the current NMI latency mitigation results in the NMI
> >> being claimed by the perf NMI handler.
> >>
> >> Update the AMD perf NMI latency mitigation workaround to, instead, use a
> >> window of time. Whenever a PMC is handled in the perf NMI handler, set a
> >> timestamp which will act as a perf NMI window. Any NMIs arriving within
> >> that window will be claimed by perf. Anything outside that window will
> >> not be claimed by perf. The value for the NMI window is set to 100 msecs.
> >> This is a conservative value that easily covers any NMI latency in the
> >> hardware. While this still results in a window in which the hpwdt module
> >> will not receive its NMI, the window is now much, much smaller.
> >
> > Blergh, I so hate all this. The proposed patch is basically duct tape.
>
> Yeah, I'm not a fan either.
>
> > The horribly retarded x86 NMI infrastructure strikes again :/
> >
> > Tom; do you have any idea how expensive it is to twiddle CR8 and play
> > games with interrupt priorities instead of piling world + dog on this
> > one NMI line? (as compared to CLI/STI)
>
> I can check on that. What are you thinking?

Avoid the whole NMI mess, make the PMC interrupt a proper vector in the
highest prio bucket and instead of using CLI/STI use CR8.

That would have the additional advantage that we could prevent perf "NMI"
then occasionally :)

Thanks,

	tglx
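
For readers following the thread, the time-window mitigation described in
the quoted patch description can be sketched roughly as below. This is a
userspace illustration only, not the code from the patch: the struct, the
function name perf_nmi_claims(), the millisecond clock argument and the
PERF_NMI_WINDOW_MS macro are all assumptions made for the example. Only
the 100 msec window and the claim/don't-claim rule come from the
description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Window length from the patch description (100 msecs); the macro name is hypothetical. */
#define PERF_NMI_WINDOW_MS 100u

/* Hypothetical per-CPU state; a kernel implementation would keep this per CPU. */
struct perf_nmi_state {
	uint64_t window_end_ms;	/* NMIs arriving before this time are claimed */
	bool window_armed;	/* false until a PMC overflow has been handled */
};

/*
 * Decide whether perf claims an NMI.
 *
 * now_ms:  current time in milliseconds (stand-in for a kernel clock)
 * handled: number of PMC overflows processed while servicing this NMI
 */
static bool perf_nmi_claims(struct perf_nmi_state *st, uint64_t now_ms,
			    int handled)
{
	if (handled > 0) {
		/* A PMC was handled: claim the NMI and (re)start the window. */
		st->window_end_ms = now_ms + PERF_NMI_WINDOW_MS;
		st->window_armed = true;
		return true;
	}

	/* Nothing handled: claim only while still inside the window. */
	return st->window_armed && now_ms < st->window_end_ms;
}
```

With this rule, an NMI that arrives 50 msecs after a handled PMC overflow
is still swallowed by perf, while one arriving 100 msecs or more later is
left unclaimed and can reach an NMI_UNKNOWN handler such as hpwdt's.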