Date: Tue, 11 Nov 2014 18:15:28 +0100
From: Frederic Weisbecker
To: Christoph Lameter
Cc: "Paul E. McKenney", Viresh Kumar, Thomas Gleixner,
    Linux Kernel Mailing List, Gilad Ben-Yossef, Tejun Heo, John Stultz,
    Mike Frysinger, Minchan Kim, Hakan Akkan, Max Krasnyansky,
    Hugh Dickins, "H. Peter Anvin", Ingo Molnar, Peter Zijlstra,
    Kevin Hilman
Subject: Future of NOHZ full/isolation development (was Re: [NOHZ] Remove
    scheduler_tick_max_deferment)
Message-ID: <20141111171526.GC3216@lerouge>
References: <20141110153147.GK4901@linux.vnet.ibm.com>

On Mon, Nov 10, 2014 at 12:26:51PM -0600, Christoph Lameter wrote:
> >
> > Would it make sense for unlimited max deferment to be available as
> > a boot parameter?  That would allow people who want tick-free execution
> > more than accurate stats to get that easily, while keeping stats accurate
> > for everyone else.
>
> Subject: Make the maximum tick deferral for CONFIG_NO_HZ configurable
>
> Add a way to configure this interval at boot and via
> /proc/sys/vm/max_defer_tick
>
> Signed-off-by: Christoph Lameter

Sorry but that's not solving the problem. All it does is to allow the
user to tune bugs. Kevin Hilman proposed something similar using debugfs
and I declined it as well. Integrating a hack like this is a good way to
make sure that nobody will ever fix the real underlying issue.

BTW, this is a good opportunity for me to generalize from this case to
the overall state of full dynticks development.

I got a lot of help from people to improve the kernel's isolation and
full dynticks: Paul has spent a lot of time improving RCU, you improved
vmstat, full dynticks got ported to other archs, people like Viresh
fixed some timer internals, Gilad fixed IPIs, Peterz reviewed a lot,
etc...

But now we have reached a stage where mostly core issues remain, and
those require some infrastructure change investments, some extensions
or a bit of rethinking. We know we have reached that stage when people
who want the features are stuck sending workarounds.

Nothing like a big rewrite is needed really, actually just a bunch of
pretty self-contained issues. And by self-contained I mean that each of
these individual problems can be worked out separately, as they are
largely unrelated to each other.

Here is a summarized list:

* Unbound workqueues affinity (to housekeeper)
* Unbound timers affinity (to housekeeper)
* 1 Hz residual scheduler tick offlining to housekeeper
* Fix some scheduler accounting that doesn't even work with 1 Hz: cpu
  load accounting, rt_scale, load balancing, etc...
* Lighten the syscall path and get rid of cputime accounting + RCU hooks
  for people who want isolation + fast syscalls and faults.
* Work on non-affinable workqueues
* Work on non-affinable timers
* ...

If I'm going to work alone on all that, this is going to take several
years, honestly. But we know what to do and how. So all we need is (at
least one) more full-time core developer to get these things done in a
reasonable amount of time.

Thanks.