From: "Rafael J. Wysocki"
To: Doug Smythies
Cc: 'Rik van Riel', 'Mike Galbraith', 'Thomas Gleixner', 'Paul McKenney',
 'Thomas Ilsche', 'Frederic Weisbecker', 'Linux PM', 'Aubrey Li', 'LKML',
 'Peter Zijlstra'
Subject: Re: [RFC/RFT][PATCH v3 0/6] sched/cpuidle: Idle loop rework
Date: Sun, 11 Mar 2018 11:34:02 +0100
Message-ID: <9753539.XXhq7nys8q@aspire.rjw.lan>
In-Reply-To: <1773378.CTekz94moy@aspire.rjw.lan>
References: <2450532.XN8DODrtDf@aspire.rjw.lan> <001801d3b90c$99232600$cb697200$@net>
 <1773378.CTekz94moy@aspire.rjw.lan>

On Sunday, March 11, 2018 11:21:36 AM CET Rafael J.
Wysocki wrote:
> On Sunday, March 11, 2018 8:43:02 AM CET Doug Smythies wrote:
> > On 2018.03.10 15:55 Rafael J. Wysocki wrote:
> > > On Saturday, March 10, 2018 5:07:36 PM CET Doug Smythies wrote:
> > > > On 2018.03.10 01:00 Rafael J. Wysocki wrote:
> >
> > ... [snip] ...
> >
> > > The information that they often spend more time than a tick
> > > period in state 0 in one go *is* relevant, though.
> > >
> > > That issue can be dealt with in a couple of ways and the patch below is a
> > > rather straightforward attempt to do that.  The idea, basically, is to
> > > discard the result of governor prediction if the tick has been stopped
> > > already and the predicted idle duration is within the tick range.
> > >
> > > Please try it on top of the v3 and tell me if you see an improvement.
> >
> > It seems pretty good so far.
> > See a new line added to the previous graph, "rjwv3plus".
> >
> > http://fast.smythies.com/rjwv3plus_100.png
>
> OK, cool!
>
> Below is a respin of the last patch which also prevents shallow states from
> being chosen due to interactivity_req when the tick is stopped.

Actually appending the patch this time, sorry.

> You may also add a poll_idle() fix I've just posted:
>
> https://patchwork.kernel.org/patch/10274595/
>
> on top of this.  It makes quite a bit of a difference for me. :-)
>
> > I'll do another 100% load on one CPU test overnight, this time with
> > a trace.
>
> Thanks!

---
From: Rafael J. Wysocki
Subject: [PATCH] cpuidle: menu: Avoid selecting shallow states with stopped tick

If the scheduler tick has been stopped already and the governor selects a
shallow idle state, the CPU can spend a long time in that state if the
selection is based on an inaccurate prediction of idle period duration.
That effect turns out to occur relatively often, so it needs to be
mitigated.
To that end, modify the menu governor to discard the result of the idle
period duration prediction if it is less than the tick period duration and
the tick is stopped, unless the tick timer is going to expire soon.

Signed-off-by: Rafael J. Wysocki
---
 drivers/cpuidle/governors/menu.c |   26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/cpuidle/governors/menu.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/menu.c
+++ linux-pm/drivers/cpuidle/governors/menu.c
@@ -297,6 +297,7 @@ static int menu_select(struct cpuidle_dr
 	unsigned long nr_iowaiters, cpu_load;
 	int resume_latency = dev_pm_qos_raw_read_value(device);
 	ktime_t tick_time;
+	unsigned int tick_us;
 
 	if (data->needs_update) {
 		menu_update(drv, dev);
@@ -315,6 +316,7 @@ static int menu_select(struct cpuidle_dr
 
 	/* determine the expected residency time, round up */
 	data->next_timer_us = ktime_to_us(tick_nohz_get_sleep_length(&tick_time));
+	tick_us = ktime_to_us(tick_time);
 
 	get_iowait_load(&nr_iowaiters, &cpu_load);
 	data->bucket = which_bucket(data->next_timer_us, nr_iowaiters);
@@ -354,12 +356,24 @@ static int menu_select(struct cpuidle_dr
 	data->predicted_us = min(data->predicted_us, expected_interval);
 
 	/*
-	 * Use the performance multiplier and the user-configurable
-	 * latency_req to determine the maximum exit latency.
+	 * If the tick is already stopped, the cost of possible misprediction is
+	 * much higher, because the CPU may be stuck in a shallow idle state for
+	 * a long time as a result of it.  For this reason, if that happens say
+	 * we might mispredict and try to force the CPU into a state for which
+	 * we would have stopped the tick, unless the tick timer is going to
+	 * expire really soon anyway.
 	 */
-	interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
-	if (latency_req > interactivity_req)
-		latency_req = interactivity_req;
+	if (tick_nohz_tick_stopped() && data->predicted_us < TICK_USEC_HZ) {
+		data->predicted_us = min_t(unsigned int, TICK_USEC_HZ, tick_us);
+	} else {
+		/*
+		 * Use the performance multiplier and the user-configurable
+		 * latency_req to determine the maximum exit latency.
+		 */
+		interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
+		if (latency_req > interactivity_req)
+			latency_req = interactivity_req;
+	}
 
 	/*
 	 * Find the idle state with the lowest power while satisfying
@@ -403,8 +417,6 @@ static int menu_select(struct cpuidle_dr
 	 */
 	if (first_idx > idx &&
 	    drv->states[first_idx].target_residency < TICK_USEC_HZ) {
-		unsigned int tick_us = ktime_to_us(tick_time);
-
 		/*
 		 * Find a state with target residency less than the
 		 * time to the next timer event including the tick.