Date: Tue, 20 Sep 2022 09:39:00 +0100
From: Kajetan Puchalski
To: Doug Smythies
Cc: rafael@kernel.org, daniel.lezcano@linaro.org, lukasz.luba@arm.com,
    Dietmar.Eggemann@arm.com, kajetan.puchalski@arm.com,
    linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/1] cpuidle: teo: Introduce optional util-awareness
References: <20220915164411.2496380-1-kajetan.puchalski@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi, thanks for taking a look!

> > This proposed optional extension to TEO would specifically tune it for minimising too deep
> > sleeps and minimising latency to achieve better performance. To this end, before selecting the next
> > idle state it uses the avg_util signal of a CPU's runqueue in order to determine to what extent the
> > CPU is being utilized. This util value is then compared to a threshold defined as a percentage of
> > the cpu's capacity (capacity >> 6 ie. ~1.5% in the current implementation).
>
> That seems quite a bit too low to me. However on my processor the
> energy cost of using
> idle state 0 verses anything deeper is very high, so I do not have a
> good way to test.

I suppose it does look low but, as I said, at least in my own testing
higher thresholds completely nullify the potential benefits of using
this. It could be because with a low enough threshold like this we are
able to catch the average util as it starts to rise, so we're already in
the 'low-latency mode' by the time it gets higher, as opposed to
correcting after the fact.

We could also always make it into some kind of tunable if need be. I was
testing it with a dedicated sysctl and it worked all right.

>
> Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> On an idle system :
> with only Idle state 0 enabled, processor package power is ~46 watts.
> with only idle state 1 enabled, processor package power is ~2.6 watts
> with all idle states enabled, processor package power is ~1.4 watts
>

Ah I see, yeah this definitely won't work on systems with idle power
usage like above.
It was designed for Arm devices like the Pixel 6 where C0 is so power
efficient that running with only C0 enabled can sometimes actually use
*less* power than running with all idle states enabled. This was for
non-intensive workloads like PCMark Web Browsing where there were enough
too deep sleeps in C1 to offset the entire power saving.

The entire idea we're relying upon here is C0 being very good to begin
with but wanting to still use *some* C1 in order to avoid bumping into
thermal issues.

> > If the util is above the
> > threshold, the governor directly selects the shallowest available idle state. If the util is below
> > the threshold, the governor defaults to the TEO metrics mechanism to try to select the deepest
> > available idle state based on the closest timer event and its own past correctness.
> >
> > Effectively this functions like a governor that on the fly disables deeper idle states when there
> > are things happening on the cpu and then immediately reenables them as soon as the cpu isn't
> > being utilized anymore.
> >
> > Initially I am sending this as a patch for TEO to visualize the proposed mechanism and simplify
> > the review process. An alternative way of implementing it while not interfering
> > with existing TEO code would be to fork TEO into a separate but mostly identical for the time being
> > governor (working name 'idleutil') and then implement util-awareness there, so that the two
> > approaches can coexist and both be available at runtime instead of relying on a compile-time option.
> > I am happy to send a patchset doing that if you think it's a cleaner approach than doing it this way.
>
> I would prefer the two to coexist for testing, as it makes it easier
> to manually compare some
> areas of focus.

That would be my preference as well, it just seems like a cleaner
approach despite having to copy over some code to begin with.
I'm just waiting for Rafael to express a view one way or the other :)

> > At the very least this approach seems promising so I wanted to discuss it in RFC form first.
> > Thank you for taking your time to read this!
>
> There might be a way forward for my type of processor if the algorithm
> were to just reduce the idle
> depth by 1 instead of all the way to idle state 0. Not sure. It seems
> to bypass all that the teo
> governor is attempting to achieve.

Oh interesting, that could definitely be worth a try. As I said, this
was designed for Arm CPUs and all of the targeted ones only have 2 idle
states, C0 and C1. Thus reducing by 1 and going all the way to 0 are the
same thing for our use case. You're right that this is potentially
pretty excessive on Intel CPUs where you could be going from state 8/9
to 0. It would result in some wasted cycles on Arm but I imagine there
should be some way forward where we could accommodate the two.

> For a single periodic workflow at any work sleep frequency (well, I
> test 5 hertz to 411 hertz) and very
> light workload: Processor package powers for 73 hertz work/sleep frequency:
>
> teo: ~1.5 watts
> menu: ~1.5 watts
> util: ~19 watts
>
> For 12 periodic workflow threads at 73 hertz work/sleep frequency
> (well, I test 5 hertz to 411 hertz) and very
> workload: Processor package powers:
>
> teo: ~2.8watts
> menu: ~2.8 watts
> util: ~49 watts
>
> My test computer is a server, with no gui. I started a desktop linux
> VM guest that isn't doing much:
>
> teo: ~1.8 watts
> menu: ~1.8 watts
> util: ~7.8 watts

Ouch that's definitely not great, really good to know what this looks
like on Intel CPUs though. Thanks a lot for taking your time to test
this out!
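
To make the two options discussed above concrete, here is a rough
user-space sketch of the decision, not the code from the patch. All
function names are hypothetical, idle states are assumed to be plain
indices with 0 as the shallowest, and teo_choice stands in for whatever
state the regular TEO metrics would have picked.

```c
#include <assert.h>

/* All names here are hypothetical and only illustrate the discussion. */

/* Threshold from the cover letter: capacity >> 6, i.e. 1/64 (~1.56%). */
static unsigned long util_threshold(unsigned long capacity)
{
	return capacity >> 6;
}

/* RFC behaviour: a utilized CPU goes straight to the shallowest state. */
static int pick_shallowest(int teo_choice, unsigned long util,
			   unsigned long capacity)
{
	assert(teo_choice >= 0);
	if (util > util_threshold(capacity))
		return 0;
	return teo_choice;	/* below threshold: keep the TEO pick */
}

/* Suggested variant: a utilized CPU only goes one state shallower. */
static int pick_one_shallower(int teo_choice, unsigned long util,
			      unsigned long capacity)
{
	assert(teo_choice >= 0);
	if (util > util_threshold(capacity) && teo_choice > 0)
		return teo_choice - 1;
	return teo_choice;
}
```

With only two idle states the two functions agree for every input, which
is why the distinction does not show up on the Arm targets; they only
diverge when the TEO pick is state 2 or deeper.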
> >
> > --
> > Kajetan
> >
> > [1] https://github.com/mrkajetanp/lisa-notebooks/blob/a2361a5b647629bfbfc676b942c8e6498fb9bd03/idle_util_aware.pdf
> >
> >
> > Kajetan Puchalski (1):
> >   cpuidle: teo: Introduce optional util-awareness
> >
> >  drivers/cpuidle/Kconfig         | 12 +++++
> >  drivers/cpuidle/governors/teo.c | 86 +++++++++++++++++++++++++++++++++
> >  2 files changed, 98 insertions(+)
> >
> > --
> > 2.37.1
> >