Date: Thu, 21 Apr 2016 16:42:13 +0200
From: Peter Zijlstra
To: Wanpeng Li
Cc: Rik van Riel, Chris Metcalf, Frederic Weisbecker, Christoph Lameter,
    Ingo Molnar, Luiz Capitulino, Thomas Gleixner, Viresh Kumar,
    "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH] nohz_full: Make sched_should_stop_tick() more conservative
Message-ID: <20160421144213.GN3408@twins.programming.kicks-ass.net>
References: <1459539771-4251-1-git-send-email-cmetcalf@mellanox.com>
    <1459797143.6219.22.camel@redhat.com>

On Mon, Apr 18, 2016 at 10:00:42AM +0800, Wanpeng Li wrote:
> > H is for hierarchy. That counts the total of runnable tasks in the
> > entire child hierarchy. Nr_running is the number of se entities in
> > the current tree.
>
> So I think we should at least change cfs_rq->nr_running to
> cfs->h_nr_running, I can send a formal patch if you think it makes
> sense. :-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 1159423..79197df 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -616,7 +616,7 @@ bool sched_can_stop_tick(struct rq *rq)
>  	}
>
>  	/* Normal multitasking need periodic preemption checks */
> -	if (rq->cfs.nr_running > 1)
> +	if (rq->cfs.h_nr_running > 1)
>  		return false;
>
>  	return true;

So I think that is indeed the right thing here. But looking at this
function I think there's more problems with it.

It seems to assume that if there are FIFO tasks, those will run. This is
incorrect. A FIFO task can have a lower prio than an RR task, in which
case the RR task will run.

So the whole fifo_nr_running test seems misplaced; it should go after the
rr_nr_running tests. That is, only if !rr_nr_running can we use
fifo_nr_running like this.
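
To make the ordering problem concrete, here is a rough userspace model of
the two orderings. This is only a sketch and not the kernel function:
struct rq_counts, its fields, and both function names are made up for
illustration, and the rule that a single RR task allows the tick to stop
is an assumption about what the rr_nr_running tests do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for per-runqueue counters; not struct rq. */
struct rq_counts {
	unsigned int fifo_nr_running;	/* runnable SCHED_FIFO tasks */
	unsigned int rr_nr_running;	/* runnable SCHED_RR tasks */
	unsigned int cfs_h_nr_running;	/* runnable CFS tasks, whole hierarchy */
};

/* Ordering criticized above: any FIFO task short-circuits the RR tests. */
static bool can_stop_tick_fifo_first(const struct rq_counts *rq)
{
	if (rq->fifo_nr_running)
		return true;

	/* Assumption: more than one RR task needs the tick for time slicing. */
	if (rq->rr_nr_running)
		return rq->rr_nr_running == 1;

	/* Normal multitasking needs periodic preemption checks. */
	return rq->cfs_h_nr_running <= 1;
}

/* Suggested ordering: RR tests first, FIFO test only if !rr_nr_running. */
static bool can_stop_tick_rr_first(const struct rq_counts *rq)
{
	if (rq->rr_nr_running)
		return rq->rr_nr_running == 1;

	if (rq->fifo_nr_running)
		return true;

	return rq->cfs_h_nr_running <= 1;
}

/* Two RR tasks plus one FIFO task (think: FIFO task at a lower prio). */
void tick_model_example(void)
{
	struct rq_counts rq = { .fifo_nr_running = 1, .rr_nr_running = 2 };

	/* FIFO-first stops the tick although the RR tasks still need it. */
	assert(can_stop_tick_fifo_first(&rq));

	/* RR-first keeps the tick running. */
	assert(!can_stop_tick_rr_first(&rq));
}
```

The model deliberately ignores priorities and deadline tasks; it only
shows that with the FIFO test first, the RR tests are never reached once a
FIFO task is runnable.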