Date: Tue, 16 Jan 2018 11:52:11 -0500
From: Luiz Capitulino
To: Frederic Weisbecker
Cc: Ingo Molnar, LKML, Peter Zijlstra, Chris Metcalf, Thomas Gleixner,
 Christoph Lameter, "Paul E . McKenney", Wanpeng Li, Mike Galbraith,
 Rik van Riel
Subject: Re: [GIT PULL] isolation: 1Hz residual tick offloading v3
Message-ID: <20180116115211.7fd55c9a@redhat.com>
In-Reply-To: <20180116154055.GA27042@lerouge>
Organization: Red Hat

On Tue, 16 Jan 2018 16:41:00 +0100
Frederic Weisbecker wrote:

> On Fri, Jan 12, 2018 at 02:18:13PM -0500, Luiz Capitulino wrote:
> > On Thu, 4 Jan 2018 05:25:32 +0100
> > Frederic Weisbecker wrote:
> >
> > > Ingo,
> > >
> > > Please pull the sched/0hz branch that can be found at:
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > > sched/0hz
> > >
> > > HEAD: 9e932b2cc707209febd130978a5eb9f4a943a3f4
> > >
> > > --
> > > Now that scheduler_tick() has become resilient towards the absence of
> > > ticks, current->sched_class->task_tick() is the last piece that needs
> > > at least 1Hz tick to keep scheduler stats alive.
> > >
> > > This patchset adds a flag to the isolcpus boot option to offload the
> > > residual 1Hz tick. This way the nohz_full CPUs don't have anymore tick
> > > (assuming nothing else requires it) as their residual 1Hz tick is
> > > offloaded to the housekeepers.
> > >
> > > For quick testing, say on CPUs 1-7:
> > >
> > > "isolcpus=nohz_offload,domain,1-7"
> >
> > Sorry for being very late to this series, but I've a few comments to
> > make (one right now and others in individual patches).
> >
> > Why are extending isolcpus= given that it's a deprecated interface?
> > Some people have already moved away from isolcpus= now, but with this
> > new feature they will be forced back to using it.
>
> I tried to remove isolcpus or at least change the way it works so that its
> effects are reversible (ie: affine the init task instead of isolating domains)
> but that got nacked due to the behaviour's expectations for userspace.
>
> That's when I realized that kernel parameters are like userspace ABIs,
> they can't be removed easily whether we deprecate them or not.
>
> Also I needed to be able to control the various isolation features, and
> nohz_full is the wrong place to do that as nohz_full is really just an
> isolation feature like the others, nohz_full= should really just imply
> full dynticks and not watchdog, workqueue or tilegx NAPI isolation...

Yeah, I completely agree with that.

> So isolcpus= is now the place where we control the isolation features
> and nohz is one of them.

That's the part I'm not very sure about. We've been advising users to
move away from isolcpus= when possible, but this much-wanted
nohz_offload feature will force everyone back to using isolcpus= again.

I have the impression this series is trying to solve two problems:

 1. How (and where) we control the various isolation features in the
    kernel

 2. Where we add the control for the tick offload feature

I think item 1 is too complex to solve right now. IMHO, this series
should focus on item 2.
And regarding item 2, I think we have two options:

 1. Make tick offload a first-class citizen by making it the default
    behaviour of nohz_full=. If there are regressions, we handle them.

 2. Add a new option to nohz_full=, like nohz_full=tick_offload

As an avid user of nohz_full I'm dying to see option 1 happen, but I'm
not totally sure what the consequences would be.

Another idea is to add CONFIG_NOHZ_TICK_OFFLOAD as an experimental
feature.

> The complain about isolcpus is the immutable result. I'm thinking about
> making it modifiable to cpuset but I only see two possible solutions:
>
> - Make the root cpuset modifiable
> - Create a directory called "isolcpus" visible on the first cpuset mount
> and move all processes there.

So, if we move the control of the tick offload to nohz_full= itself, we
can completely drop any isolcpus= change from this series. I think that
should be a great relief to you :)

> > What about just adding the new functionality to nohz_full=? That is,
> > no new options, just make the tick go away since this has always been
> > what nohz_full= was intended to do?
>
> We can, or have isolcpus=nohz to do it, as both do almost the same.
>
> But I'm afraid about the overhead for people used to nohz_full= once
> they upgrade their kernels and see those workqueues once per second.
>
> We can still affine those workqueues (in fact the whole unbound workqueue
> mask) outside the nohz_full range. Still current users may be surprised
> about that new overhead on housekeeping CPUs...
>
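
To illustrate the last quoted point about keeping the unbound workqueue
mask outside the nohz_full range, here is a minimal userspace sketch. It
only computes the hex bitmask of the housekeeping CPUs, assuming an 8-CPU
machine with CPUs 1-7 isolated as in the quick-testing example. The
helper names parse_cpu_list and housekeeping_mask are made up for this
illustration and are not kernel code. The result is the kind of value one
would write by hand to /sys/devices/virtual/workqueue/cpumask.

```python
def parse_cpu_list(spec):
    """Expand a kernel-style CPU list such as "1-7" or "0,2-3" into a set."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def housekeeping_mask(nr_cpus, isolated_spec):
    """Return the hex bitmask of CPUs that are NOT in the isolated list."""
    isolated = parse_cpu_list(isolated_spec)
    mask = 0
    for cpu in range(nr_cpus):
        if cpu not in isolated:
            mask |= 1 << cpu  # bit N set means CPU N does housekeeping
    return format(mask, "x")


if __name__ == "__main__":
    # Assumed topology: 8 CPUs, CPUs 1-7 isolated, so only CPU 0 remains.
    print(housekeeping_mask(8, "1-7"))
```

This is a sketch of the mask arithmetic only. Whether the kernel should
apply such a mask automatically when the tick is offloaded is exactly the
open question in the thread.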