Date: Tue, 11 Jul 2017 11:11:08 -0700
From: "Paul E. McKenney"
To: Frederic Weisbecker
Cc: Aubrey Li, tglx@linutronix.de, peterz@infradead.org, len.brown@intel.com,
	rjw@rjwysocki.net, ak@linux.intel.com, tim.c.chen@linux.intel.com,
	arjan@linux.intel.com, yang.zhang.wz@gmail.com, x86@kernel.org,
	linux-kernel@vger.kernel.org, Aubrey Li
Subject: Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods
Reply-To: paulmck@linux.vnet.ibm.com
References: <1499650721-5928-1-git-send-email-aubrey.li@intel.com>
	<1499650721-5928-5-git-send-email-aubrey.li@intel.com>
	<20170711125847.GA13265@linux.vnet.ibm.com>
	<20170711163353.GB18805@lerouge>
In-Reply-To: <20170711163353.GB18805@lerouge>
Message-Id: <20170711181108.GQ2393@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jul 11, 2017 at 06:33:55PM +0200, Frederic Weisbecker wrote:
> On Tue, Jul 11, 2017 at 05:58:47AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
> > > From: Aubrey Li
> > >
> > > The system will enter a fast idle loop if the predicted idle period
> > > is shorter than the threshold.
> > > ---
> > >  kernel/sched/idle.c | 9 ++++++++-
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > > index cf6c11f..16a766c 100644
> > > --- a/kernel/sched/idle.c
> > > +++ b/kernel/sched/idle.c
> > > @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
> > >   */
> > >  static void do_idle(void)
> > >  {
> > > +	unsigned int predicted_idle_us;
> > > +	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
> > >  	/*
> > >  	 * If the arch has a polling bit, we maintain an invariant:
> > >  	 *
> > > @@ -291,7 +293,12 @@ static void do_idle(void)
> > >
> > >  	__current_set_polling();
> > >
> > > -	cpuidle_generic();
> > > +	predicted_idle_us = cpuidle_predict();
> > > +
> > > +	if (likely(predicted_idle_us < short_idle_threshold))
> > > +		cpuidle_fast();
> >
> > What if we get here from nohz_full usermode execution?  In that
> > case, if I remember correctly, the scheduling-clock interrupt
> > will still be disabled, and would have to be re-enabled before
> > we could safely invoke cpuidle_fast().
> >
> > Or am I missing something here?
>
> That's a good point. It's partially ok because if the tick is needed
> for something specific, it is not entirely stopped but programmed to that
> deadline.
>
> Now there is some idle specific code when we enter dynticks-idle. See
> tick_nohz_start_idle(), tick_nohz_stop_idle(), sched_clock_idle_wakeup_event()
> and some subsystems that react differently when we enter dyntick idle
> mode (scheduler_tick_max_deferment) so the tick may need a reevaluation.
>
> For now I'd rather suggest that we treat full nohz as an exception case here
> and do:
>
> 	if (!tick_nohz_full_cpu(smp_processor_id()) && likely(predicted_idle_us < short_idle_threshold))
> 		cpuidle_fast();
>
> Ugly but safer!

Works for me!

							Thanx, Paul
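
For anyone following along, here is a rough userspace sketch of the
decision being proposed, so the two conditions can be poked at outside
the kernel. It is only a model under stated assumptions:
choose_idle_path(), sketch_jiffies_to_usecs() and SKETCH_HZ are
hypothetical stand-ins, not kernel symbols, and HZ=1000 is assumed purely
for illustration. Falling back to the generic path when the fast path is
not taken is also an assumption, since the quoted hunk is trimmed before
any else branch.

```c
#include <stdbool.h>

/* Assumed tick rate for this sketch only; real kernels configure HZ. */
#define SKETCH_HZ 1000u

/* Hypothetical stand-in for jiffies_to_usecs(); not the kernel helper. */
static unsigned int sketch_jiffies_to_usecs(unsigned int jiffies_count)
{
	return jiffies_count * (1000000u / SKETCH_HZ);
}

enum idle_path {
	IDLE_PATH_FAST,		/* models calling cpuidle_fast() */
	IDLE_PATH_GENERIC,	/* models calling cpuidle_generic() (assumed fallback) */
};

/*
 * Models the suggested check: take the fast path only when the CPU is
 * not a nohz_full CPU and the predicted idle period is shorter than
 * half a jiffy.  The bool parameter stands in for
 * tick_nohz_full_cpu(smp_processor_id()).
 */
static enum idle_path choose_idle_path(bool cpu_is_nohz_full,
				       unsigned int predicted_idle_us)
{
	unsigned int short_idle_threshold = sketch_jiffies_to_usecs(1) / 2;

	if (!cpu_is_nohz_full && predicted_idle_us < short_idle_threshold)
		return IDLE_PATH_FAST;

	return IDLE_PATH_GENERIC;
}
```

With the assumed HZ=1000 the threshold works out to 500us, so a
predicted 100us idle period on an ordinary CPU selects the fast path,
while the same prediction on a nohz_full CPU stays on the generic path.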