Date: Mon, 12 Aug 2019 21:34:31 +0200 (CEST)
From: Thomas Gleixner
To: Josh Hunt
cc: Andi Kleen, Peter Zijlstra, Cong Wang, "Liang, Kan", jolsa@redhat.com, bigeasy@linutronix.de, "H. Peter Anvin", Ingo Molnar, x86, LKML
Subject: Re: Long standing kernel warning: perfevents: irq loop stuck!
References: <20180223121456.GZ25201@hirez.programming.kicks-ass.net> <20180226203937.GA21543@tassilo.jf.intel.com>

On Mon, 12 Aug 2019, Josh Hunt wrote:
> On Mon, Aug 12, 2019 at 10:55 AM Thomas Gleixner wrote:
> >
> > On Mon, 12 Aug 2019, Josh Hunt wrote:
> > > Was there any progress made on debugging this issue? We are still
> > > seeing it on 4.19.44:
> >
> > I haven't seen anyone looking at this.
> >
> > Can you please try the patch Ingo posted:
> >
> >   https://lore.kernel.org/lkml/20150501070226.GB18957@gmail.com/
> >
> > and if it fixes the issue decrease the value from 128 to the point where it
> > comes back, i.e. 128 -> 64 -> 32 ...
> >
> > Thanks,
> >
> >	tglx
>
> I just checked the machines where this problem occurs and they're both
> Nehalem boxes. I think Ingo's patch would only help Haswell machines.
> Please let me know if I misread the patch or if what I'm seeing is a
> different issue than the one Cong originally reported.

Find the NHM hack below.

Thanks,

	tglx

8<----------------

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 648260b5f367..93c1a4f0e73e 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3572,6 +3572,11 @@ static u64 bdw_limit_period(struct perf_event *event, u64 left)
 	return left;
 }
 
+static u64 nhm_limit_period(struct perf_event *event, u64 left)
+{
+	return max(left, 128ULL);
+}
+
 PMU_FORMAT_ATTR(event,	"config:0-7"	);
 PMU_FORMAT_ATTR(umask,	"config:8-15"	);
 PMU_FORMAT_ATTR(edge,	"config:18"	);
@@ -4606,6 +4611,7 @@ __init int intel_pmu_init(void)
 		x86_pmu.pebs_constraints = intel_nehalem_pebs_event_constraints;
 		x86_pmu.enable_all = intel_pmu_nhm_enable_all;
 		x86_pmu.extra_regs = intel_nehalem_extra_regs;
+		x86_pmu.limit_period = nhm_limit_period;
 
 		mem_attr = nhm_mem_events_attrs;
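
For anyone who wants to see what the clamp does before booting a test kernel, here is a rough userspace sketch. It is not kernel code: clamp_period() and next_min_period() are hypothetical helper names made up for illustration, and the minimum is passed as a parameter (the patch hardcodes 128) so that the 128 -> 64 -> 32 halving suggested earlier in the thread is easy to follow.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the clamp in nhm_limit_period().
 * The remaining sample period 'left' is never allowed to drop below
 * 'min_period'; larger values pass through unchanged.
 */
static uint64_t clamp_period(uint64_t left, uint64_t min_period)
{
	return left < min_period ? min_period : left;
}

/*
 * Hypothetical helper: the next minimum to try when halving the value
 * step by step (128 -> 64 -> 32 ...). Returns 0 once there is nothing
 * left to halve.
 */
static uint64_t next_min_period(uint64_t min_period)
{
	return min_period / 2;
}
```

With a minimum of 128, a requested period of 1 becomes 128 while a period of 100000 is left alone. Only very short periods are affected by the clamp.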