Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753933AbaKMHXA (ORCPT ); Thu, 13 Nov 2014 02:23:00 -0500 Received: from mail-ob0-f176.google.com ([209.85.214.176]:51334 "EHLO mail-ob0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751923AbaKMHW6 (ORCPT ); Thu, 13 Nov 2014 02:22:58 -0500 MIME-Version: 1.0 In-Reply-To: References: <20141110153147.GK4901@linux.vnet.ibm.com> <20141111171526.GC3216@lerouge> <20141112135455.GA6895@lerouge> <20141112150651.GF21343@twins.programming.kicks-ass.net> Date: Thu, 13 Nov 2014 12:52:57 +0530 Message-ID: Subject: Re: Future of NOHZ full/isolation development (was Re: [NOHZ] Remove scheduler_tick_max_deferment) From: Viresh Kumar To: Peter Zijlstra Cc: Frederic Weisbecker , Christoph Lameter , "Paul E. McKenney" , Thomas Gleixner , Linux Kernel Mailing List , Gilad Ben-Yossef , Tejun Heo , John Stultz , Mike Frysinger , Minchan Kim , Hakan Akkan , Max Krasnyansky , Hugh Dickins , "H. Peter Anvin" , Ingo Molnar , Kevin Hilman Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 12 November 2014 20:46, Viresh Kumar wrote: > On 12 November 2014 20:36, Peter Zijlstra wrote: >> I don't think you need to add anything. We already have tracepoints for >> every single interrupt (and therefore also for the hrtimer one) and we >> have expiry tracepoints. > > I will crosscheck this again but as far as I remember these spurious interrupts > aren't caught with tracers currently. Not even the irq_enter/exit ones. So yes, we do get irq_handler_entry/exit traces for the clockevent device. The problem here is that people miss these as there are no warnings issued here. And so nobody tried to resolve these spurious interrupts. This is how it looks on a ARM machine (With trace_hrtimer_spurious), though it is relevant for all architectures which emulate ONESHOT mode over PERIODIC mode: -0 [001] d.h2 719.642085: irq_handler_entry: irq=30 name=arch_timer -0 [001] d.h3 719.646506: hrtimer_spurious: count:105 -0 [001] d.h2 719.646511: irq_handler_exit: irq=30 ret=handled -0 [001] d..3 719.646522: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719640000000 softexpires=719640000000 -0 [001] d..3 719.646545: hrtimer_cancel: hrtimer=ee7bb740 -0 [001] d.h2 719.647086: irq_handler_entry: irq=30 name=arch_timer -0 [001] d.h3 719.651540: hrtimer_spurious: count:106 -0 [001] d.h2 719.651545: irq_handler_exit: irq=30 ret=handled -0 [001] d..3 719.651554: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719645000000 softexpires=719645000000 -0 [001] d..3 719.651576: hrtimer_cancel: hrtimer=ee7bb740 -0 [001] d.h2 719.652084: irq_handler_entry: irq=30 name=arch_timer -0 [001] d.h3 719.656571: hrtimer_spurious: count:107 -0 [001] d.h2 719.656575: irq_handler_exit: irq=30 ret=handled -0 [001] d..3 719.656583: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719650000000 softexpires=719650000000 -0 [001] d..3 719.656601: hrtimer_cancel: hrtimer=ee7bb740 -0 [001] d.h2 719.657084: irq_handler_entry: irq=30 name=arch_timer -0 [001] d.h3 719.661611: hrtimer_spurious: count:108 -0 [001] d.h2 719.661616: irq_handler_exit: irq=30 ret=handled -0 [001] d..3 719.661627: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719655000000 softexpires=719655000000 -0 [001] d..3 719.661652: hrtimer_cancel: hrtimer=ee7bb740 -0 [001] d.h2 719.662086: irq_handler_entry: irq=30 name=arch_timer -0 [001] d.h3 719.666643: hrtimer_spurious: count:109 -0 [001] d.h2 719.666648: irq_handler_exit: irq=30 ret=handled -0 [001] d..3 719.666657: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719660000000 softexpires=719660000000 -0 [001] d..3 719.666680: hrtimer_cancel: hrtimer=ee7bb740 -0 [001] d.h2 719.667084: irq_handler_entry: irq=30 name=arch_timer -0 [001] d.h3 719.671678: hrtimer_spurious: count:110 -0 [001] d.h2 719.671682: irq_handler_exit: irq=30 ret=handled -0 [001] d..3 719.671693: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719665000000 softexpires=719665000000 -0 [001] d..3 719.671714: hrtimer_cancel: hrtimer=ee7bb740 -0 [001] d.h2 719.672085: irq_handler_entry: irq=30 name=arch_timer -0 [001] d.h3 719.676712: hrtimer_spurious: count:111 -0 [001] d.h2 719.676717: irq_handler_exit: irq=30 ret=handled -0 [001] d..3 719.676726: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719670000000 softexpires=719670000000 -0 [001] d..3 719.676749: hrtimer_cancel: hrtimer=ee7bb740 -0 [001] d.h2 719.677087: irq_handler_entry: irq=30 name=arch_timer -0 [001] d.h3 719.681745: hrtimer_spurious: count:112 -0 [001] d.h2 719.681750: irq_handler_exit: irq=30 ret=handled -0 [001] d..3 719.681759: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719675000000 softexpires=719675000000 -0 [001] d..3 719.681780: hrtimer_cancel: hrtimer=ee7bb740 -0 [001] d.h2 719.682084: irq_handler_entry: irq=30 name=arch_timer -0 [001] d.h3 719.686783: hrtimer_spurious: count:113 -0 [001] d.h2 719.686787: irq_handler_exit: irq=30 ret=handled -0 [001] d..3 719.686797: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719680000000 softexpires=719680000000 -0 [001] d..3 719.686819: hrtimer_cancel: hrtimer=ee7bb740 Normally this happens on a non-spurious interrupt: is-cpu-isolated-1963 [000] d.h1 719.642080: irq_handler_entry: irq=30 name=arch_timer is-cpu-isolated-1963 [000] d.h2 719.642086: hrtimer_cancel: hrtimer=ee7b3740 is-cpu-isolated-1963 [000] d.h1 719.642089: hrtimer_expire_entry: hrtimer=ee7b3740 function=tick_sched_timer now=719635008926 is-cpu-isolated-1963 [000] d.h1 719.642095: softirq_raise: vec=1 [action=TIMER] is-cpu-isolated-1963 [000] d.h1 719.642099: softirq_raise: vec=9 [action=RCU] is-cpu-isolated-1963 [000] d.h2 719.642104: sched_stat_runtime: comm=trace-isolation pid=1963 runtime=348042 [ns] vruntime=7804506025 [ns] is-cpu-isolated-1963 [000] d.h1 719.642111: hrtimer_expire_exit: hrtimer=ee7b3740 is-cpu-isolated-1963 [000] d.h2 719.642115: hrtimer_start: hrtimer=ee7b3740 function=tick_sched_timer expires=719640000000 softexpires=719640000000 is-cpu-isolated-1963 [000] d.h1 719.642121: irq_handler_exit: irq=30 ret=handled .. followed by softirq processing. Another trace point with trace_hrtimer_spurious() might end up being useful :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/