Date: Fri, 25 Jan 2019 08:00:56 -0800
From: Andi Kleen
To: Ravi Bangoria
Cc: lkml, Jiri Olsa, Peter Zijlstra, linux-perf-users@vger.kernel.org,
 Arnaldo Carvalho de Melo, eranian@google.com, vincent.weaver@maine.edu,
 "Naveen N. Rao"
Subject: Re: System crash with perf_fuzzer (kernel: 5.0.0-rc3)
Message-ID: <20190125160056.GG6118@tassilo.jf.intel.com>
In-Reply-To: <7c7ec3d9-9af6-8a1d-515d-64dcf8e89b78@linux.ibm.com>

> [Fri Jan 25 10:28:53 2019] perf: interrupt took too long (2501 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
> [Fri Jan 25 10:29:08 2019] perf: interrupt took too long (3136 > 3126), lowering kernel.perf_event_max_sample_rate to 63750
> [Fri Jan 25 10:29:11 2019] perf: interrupt took too long (4140 > 3920), lowering kernel.perf_event_max_sample_rate to 48250
> [Fri Jan 25 10:29:11 2019] perf: interrupt took too long (5231 > 5175), lowering kernel.perf_event_max_sample_rate to 38000
> [Fri Jan 25 10:29:11 2019] perf: interrupt took too long (6736 > 6538), lowering kernel.perf_event_max_sample_rate to 29500

These are fairly normal.

> [Fri Jan 25 10:32:44 2019] ------------[ cut here ]------------
> [Fri Jan 25 10:32:44 2019] perfevents: irq loop stuck!

I believe it's always possible to cause an irq loop. This happens when
the PMU is programmed to cause PMIs on multiple counters too quickly.
Maybe should just recover from it without printing such scary messages.

Right now the scary message is justified because it resets the complete
PMU. Perhaps need to be a bit more selective resetting on only the events
that loop.

> [Fri Jan 25 10:32:44 2019] WARNING: CPU: 1 PID: 0 at arch/x86/events/intel/core.c:2440 intel_pmu_handle_irq+0x158/0x170

This looks independent.

I would apply the following patch (cut'n'pasted, so may need manual apply)
and then run with

cd /sys/kernel/debug/tracing
echo 50000 > buffer_size_kb
echo default_do_nmi > set_graph_function
echo 1 > events/msr/enable
echo 'msr != 0xc0000100 && msr != 0x6e0' > events/msr/write_msr/filter
echo function_graph > current_tracer
echo printk:traceoff > set_ftrace_filter
echo 1 > tracing_on

and then collect the trace from /sys/kernel/debug/tracing/trace after the
oops. This should show the context of when it happens.

diff --git a/kernel/events/Makefile b/kernel/events/Makefile
index 3c022e33c109..8afc997110e0 100644
--- a/kernel/events/Makefile
+++ b/kernel/events/Makefile
@@ -1,7 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
-ifdef CONFIG_FUNCTION_TRACER
-CFLAGS_REMOVE_core.o = $(CC_FLAGS_FTRACE)
-endif
 
 obj-y := core.o ring_buffer.o callchain.o
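
For anyone who wants to repeat the tracing setup on several machines, the
echo commands above could be wrapped roughly like this. It is only a sketch:
the names TRACE_SETTINGS and apply_trace_settings are made up for
illustration, and it assumes debugfs is mounted at /sys/kernel/debug and
that the script runs as root. The file names and values are copied from the
commands above without change.

```python
from pathlib import Path

# Settings from the message above, in the order given there.
# Each entry is (file relative to the tracing directory, value to write).
TRACE_SETTINGS = [
    ("buffer_size_kb", "50000"),
    ("set_graph_function", "default_do_nmi"),
    ("events/msr/enable", "1"),
    ("events/msr/write_msr/filter", "msr != 0xc0000100 && msr != 0x6e0"),
    ("current_tracer", "function_graph"),
    ("set_ftrace_filter", "printk:traceoff"),
    ("tracing_on", "1"),
]


def apply_trace_settings(tracing_dir="/sys/kernel/debug/tracing"):
    """Write each setting like `echo value > file`; return the files written.

    tracing_dir is a parameter (an assumption of this sketch) so the
    function can be pointed at a scratch directory for a dry run.
    """
    root = Path(tracing_dir)
    written = []
    for rel_path, value in TRACE_SETTINGS:
        target = root / rel_path
        # Truncating write of the value plus newline, as `echo ... >` does.
        target.write_text(value + "\n")
        written.append(str(target))
    return written
```

Calling apply_trace_settings() with no argument targets the real tracing
directory. Keeping the settings in one ordered list preserves the sequence
from the commands, with tracing_on written last.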