Date: Sat, 2 Apr 2016 13:17:26 +0200
From: Markus Trippelsdorf
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin
Subject: Re: "perf: interrupt took too long" messages even with perf_cpu_time_max_percent==0
Message-ID: <20160402111726.GB305@x4>
References: <20160402093239.GA305@x4> <20160402110017.GP3448@twins.programming.kicks-ass.net>
In-Reply-To: <20160402110017.GP3448@twins.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2016.04.02 at 13:00 +0200, Peter Zijlstra wrote:
> On Sat, Apr 02, 2016 at 11:32:39AM +0200, Markus Trippelsdorf wrote:
> > Current git kernel sometimes shows:
> >
> > perf: interrupt took too long (71 > 52), lowering kernel.perf_event_max_sample_rate to 300
> > perf: interrupt took too long (103 > 88), lowering kernel.perf_event_max_sample_rate to 300
> > perf: interrupt took too long (130 > 128), lowering kernel.perf_event_max_sample_rate to 300
> > perf: interrupt took too long (175 > 162), lowering kernel.perf_event_max_sample_rate to 300
> > perf: interrupt took too long (219 > 218), lowering kernel.perf_event_max_sample_rate to 300
> > ...
> >
> > when running e.g. "perf top" even when
> > /proc/sys/kernel/perf_cpu_time_max_percent is set to 0.
>
> Ah, was 0 also meant to disable it?

Yes.
From Documentation/sysctl/kernel.txt:

perf_cpu_time_max_percent:

Hints to the kernel how much CPU time it should be allowed to
use to handle perf sampling events.  If the perf subsystem
is informed that its samples are exceeding this limit, it
will drop its sampling frequency to attempt to reduce its CPU
usage.

Some perf sampling happens in NMIs.  If these samples
unexpectedly take too long to execute, the NMIs can become
stacked up next to each other so much that nothing else is
allowed to execute.

0: disable the mechanism.  Do not monitor or correct perf's
   sampling rate no matter how CPU time it takes.

1-100: attempt to throttle perf's sample rate to this
   percentage of CPU.  Note: the kernel calculates an
   "expected" length of each sample event.  100 here means
   100% of that expected length.  Even if this is set to
   100, you may still see sample throttling if this
   length is exceeded.  Set to 0 if you truly do not care
   how much CPU is consumed.

> Does the below help?

Thanks. I will test it later. But 91a612eea9a3 makes the assumption that
only sysctl_perf_cpu_time_max_percent==100 disables the feature also in
kernel/events/core.c.

-- 
Markus
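
For illustration only, the semantics documented in kernel.txt could be
modelled in userspace roughly as follows. This is a sketch, not the code
in kernel/events/core.c: the function names and the expected_ns parameter
(standing in for the kernel's "expected" sample length) are hypothetical,
and the special handling of 100 attributed to 91a612eea9a3 is deliberately
not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the documented behaviour of
 * perf_cpu_time_max_percent. Valid inputs are assumed to be 0..100.
 */

/* Per Documentation/sysctl/kernel.txt, 0 disables the mechanism. */
static bool throttle_enabled(unsigned int max_percent)
{
	return max_percent != 0;
}

/*
 * Time budget for one sample: max_percent of the "expected" sample
 * length (expected_ns is an assumed input). Returns 0 when disabled.
 */
static uint64_t sample_allowed_ns(unsigned int max_percent,
				  uint64_t expected_ns)
{
	if (!throttle_enabled(max_percent))
		return 0;
	return expected_ns * max_percent / 100;
}

/* True if a sample that took took_ns should lead to a lower sample rate. */
static bool sample_too_long(unsigned int max_percent, uint64_t expected_ns,
			    uint64_t took_ns)
{
	if (!throttle_enabled(max_percent))
		return false;	/* 0: never monitor or correct */
	return took_ns > sample_allowed_ns(max_percent, expected_ns);
}
```

With max_percent = 25 and expected_ns = 1000, a sample taking 251 ns
exceeds the 250 ns budget, while with max_percent = 0 no sample is ever
flagged, which is the behaviour the documentation describes.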