Received: by 10.223.164.202 with SMTP id h10csp5402981wrb; Tue, 21 Nov 2017 09:10:32 -0800 (PST) X-Google-Smtp-Source: AGs4zMaulFRNFiHNui5U2eM9Hi+tyDIat0tEuGrFKWqRHV5V+u4XFMmLoYNWPuncUY0EegG3OIKF X-Received: by 10.99.120.195 with SMTP id t186mr565299pgc.62.1511284231950; Tue, 21 Nov 2017 09:10:31 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1511284231; cv=none; d=google.com; s=arc-20160816; b=X4oT+hXQgSO+vaFiKFGlSRxOKqVLUddwIEEFVdWi+U2ygKZtW92M3ChnvONe1pTsm8 8ZQ5w7o5SWVZ7Ia2p1XALHC/2dIYD/AFqnb6A9bCwM20/eYKrp7V39vFGctkoqqAHKFv m7dQhlV5htE4YhRNLOoiCRpNf3TxsiE5aEn4HVOhnc1FF52/08qrdsqbMdTLwjPbTmTE D5tZAgUpDvB/iEnBm8YCBB3DitU5/lr0yiyyDOpLsNV94QWe8vbRbUBXjsjBOyR6rBS1 +OlTYP1WbrgDA1xrF9ATZFV9yd2qegLzZ/EDBmTMxwSDZ8wSUBtDhzd9+cejaSAWqcIk /Unw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature:arc-authentication-results; bh=NiaIgkZ0aZu2jMt4baeehmHbJgqjFu9j5CCLZOE81fg=; b=tFaRx2As3rXCgyw5WiehWM700uRTLuwSDpQ+wSbCvlNKnOJPRiHAAgCWzSNYcLx3+v feaUoN0zOPESccDO+fcIiVLZ2JKdQwrMorrrHbd9GpxfsOqNQ6kQgNuuVu/M2rOxjvxo JtJuUrWq9Uu5BHTcid1DiT4BTZdxwzCP20ea6r8uBMqoM8lwyvI1n5/cg6UPF9fesGmE eyqCta08p2HxNyfhHvAwI+joQwpdoOMP7tJO5swLxarDCWPmN9o+F9pNB+LKw9uvLW6+ thThFsmU/HNT063iU3HqzS8yY/MMpK3Fq+tebrh0fuxnViFZq+daOEC77W5ysBsjbiM2 QUjA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=Q0reWstP; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f2si11934033plk.121.2017.11.21.09.10.20; Tue, 21 Nov 2017 09:10:31 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=Q0reWstP; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751479AbdKURJ0 (ORCPT + 75 others); Tue, 21 Nov 2017 12:09:26 -0500 Received: from mail-pg0-f50.google.com ([74.125.83.50]:38798 "EHLO mail-pg0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751145AbdKURJY (ORCPT ); Tue, 21 Nov 2017 12:09:24 -0500 Received: by mail-pg0-f50.google.com with SMTP id s11so10637950pgc.5; Tue, 21 Nov 2017 09:09:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=NiaIgkZ0aZu2jMt4baeehmHbJgqjFu9j5CCLZOE81fg=; b=Q0reWstPxrkNbgyeEEzlCayFdobV6To5AilcQ84qmTnadCxHhFFHf3kfZmdCQivAMt mV3tGvBrfMLp0ZLxzksqto5FidaB/8KFnP71ffg9PHAIg2bXetPxU+ayv18I1s8SKbSx gc0EWdUDGKnBTWE//N4W/7cn9G3qCduMJun8l+8R2SFEQz/rl6qoCQyK3b6J4RTbFtvb ZsIksJpwBAnnzkMhLLSfsLAtNvpWAZ08WPyqxYrT6XtpJs/i8dH5pm0BB9tBxWV5UaxA u5ghQDDiiPSNQKErc/EwvmqC3V9+98EtkuNofMvObLnZD0x80rjhjX8vXovh7gjux52l +Ttw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=NiaIgkZ0aZu2jMt4baeehmHbJgqjFu9j5CCLZOE81fg=; b=Yk6L1KHDqlpcVyFjgVxR1qMrUB0PN3XJeznoPtGkQIUzez5Gw1CM9ckuCbWhOVItdU bPKf8XDOBSm0jPBw98HjH2En9T4M98W9dxwea7frRC+owGM5594lOAKhk9TvWKzm0m28 TZSejBcUQtVpGK0NrfGRBwND4LHRIZxqSNIduZGOVdUmx00An1zqkbkMpchlUo5RbXOP wo9GvfGAqr96lHuOrLI0T4nBM98FcY2ZVwlmHrVS2iQT9uTuCzIBNCkUBWRX2WJmmYjW cmQYmU4bYZf/376RZqHGwSoTE/AqxkvIpPuoIVWJ4c3wKxpRQYljDu+NHVdcno/6Pw6u z4Lw== X-Gm-Message-State: AJaThX4DVqX8x8zEyLMj9c8ql5re35b39FlUpqldmb6gyySKoYuYCWhP 13tZMEL9ALEqpReLePUF4RA= X-Received: by 10.99.163.25 with SMTP id s25mr17912496pge.310.1511284163298; Tue, 21 Nov 2017 09:09:23 -0800 (PST) Received: from localhost.localdomain ([2601:644:8201:32e0:7256:81ff:febd:926d]) by smtp.gmail.com with ESMTPSA id a78sm25972295pfl.155.2017.11.21.09.09.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 21 Nov 2017 09:09:22 -0800 (PST) Date: Tue, 21 Nov 2017 09:09:19 -0800 From: Eduardo Valentin To: Ionela Voinescu Cc: Punit Agrawal , "Rafael J. Wysocki" , Viresh Kumar , "Rafael J. Wysocki" , Daniel Lezcano , Amit Daniel Kachhap , Javi Merino , Zhang Rui , Steven Rostedt , Ingo Molnar , Linux PM , Vincent Guittot , lukasz.luba@arm.com, Linux Kernel Mailing List , Javi Merino Subject: Re: [PATCH 4/4] cpu_cooling: Drop static-power related stuff Message-ID: <20171121170917.GB2499@localhost.localdomain> References: <65fef2a1-d23f-3de2-bd91-021296c3e2f7@arm.com> <20171116152058.GR3257@vireshk-i7> <2810372.fJ2vMN0cWO@aspire.rjw.lan> <20171116234422.GA6141@localhost.localdomain> <878tf5tbfj.fsf@e105922-lin.cambridge.arm.com> <35d3751d-f28d-38c2-02b2-c9980f11c52e@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <35d3751d-f28d-38c2-02b2-c9980f11c52e@arm.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On Tue, Nov 21, 2017 at 11:30:44AM +0000, Ionela Voinescu wrote: > Hi guys, > > On 17/11/17 11:02, Punit Agrawal wrote: > > Hi Eduardo, > > > > Eduardo Valentin writes: > > > >> Hello, > >> > >> On Fri, Nov 17, 2017 at 12:31:41AM +0100, Rafael J. Wysocki wrote: > >>> On Thursday, November 16, 2017 4:20:58 PM CET Viresh Kumar wrote: > >>>> On 16-11-17, 15:02, Ionela Voinescu wrote: > >>>>> When it was added in lsk 3.18 in what was then a thermal driver for Juno > >>>>> it was believed to have an effect in thermal mitigation, but that was > >>>>> not proven later as to justify posting it upstream, and that is why the > >>>>> code never made it in mainline. > >> > >> > >> Really? Do you have data that can be shared to back up the above > >> statement? > >> > >> Last time I checked, not only in Juno, static power does have > >> a non-negligible contribution. Neglecting static power can essentially > >> make IPA to undershoot in many cases to eventually overshoot. And that > >> is what I recollect from the data that I was presented, even for getting > >> this code reviewed. Just do not recollect to have the link to it. > > > > Just to make sure we are on the same page - static power can be a > > significant portion of SoC power consumption. It varies widely across > > SoCs and from experience depends on things like fabrication process, > > ambient temperature, applied voltage/current drain, etc. There are many > > SoCs where static power is a significant part of power consumption and > > needs to be modelled for good thermal control. > > > > Specifically in the case of Juno - we'd done some thermal and > > performance benchmarking when working on IPA. This included implementing > > a static power model to understand and test it's impact. If memory > > serves me right static power was approximately in the 10-15% > > range. Unfortunately, I can't find any datasets to support or disprove > > this claim. > > > > My take away from the Juno experiment was that it is not a particularly > > thermally constrained system. At the least it took many 10s of seconds > > running at max frequency (both clusters and the GPU) with the case fan > > turned off for it to see a 10C increase. So the lack of a static power > > model wasn't affecting it's thermal control. > > > > But this situation is only true for Juno. More below... > > > > Sorry, I did not mean to say that static power is irrelevant. I only > specified that for Juno, the values used in the lsk3.18 kernel did not > have a significant effect in thermal mitigation, as Punit detailed here > and below. > > Talking generically, IPA uses approximate values for dynamic power > consumption (due to approximate values for the dynamic power > coefficient, frequency at the end of the analysis window, an approximate > value for utilization), approximate values for sustainable power in > within a thermal zone, and approximate values for static power. A lot of > my own experiments so far showed that IPA can compensate very well for > inaccuracies in all of these due to the PID control loop but it pays the > price in the stability of its signal when they are way off. > > As you also mention, when the accuracy of these values is neglected this > can result in an inefficient ramp up and seesaw effects (undershoots and > overshoots). That's why is very important for IPA to be properly tuned > and I would not suggest that any of these should be neglected, but to be > as accurate as possible. > > But there is only so much accuracy that you can achieve given the high > cost of added complexity: static power and the dynamic power coefficient > depend on PVT which sometimes cannot easily be factored in, the > utilization that scales the dynamic power cannot be easily tracked > especially for clusters of CPUs and given inaccurate estimations of idle > states, etc. > > This being said, I believe there are platforms out there where the > static power cost might be much too expensive to model for the gain it > brings in the stability and accuracy of IPA power request estimation and > allocation. Also, as you pointed out, there is reluctance in sharing > these models. > > >> > >>>>> The code added there can be found at: > >>>>> https://git.linaro.org/landing-teams/working/arm/kernel-release.git/tree/drivers/thermal/scpi-thermal.c?h=lsk-3.18-armlt#n247 > >>>>> > >>>>> As for removing this code now, I would not want to make that decision without > >>>>> spending more time to check if it impacts other customer codelines. > >>>>> > >> > >> Maybe this is a matter of Linaro/ARM failing to convince SoC vendors to > >> really publish static power to mainline Linux instead of really having > >> the benefit of modeling it? Even simple models based on temperature > >> ranges would be better than only using dynamic power model. > > > > I know of a few SoCs implementing static power model in their kernel > > (not mainline). It would be great for that code to hit mainline. But as > > with all the out of tree code, I'm not sure have much influence we have > > in making that happen. > > > > Again, I don't think we are arguing about the importance of static power > > in a power model based thermal control strategy like IPA. > > > >> > >>>>> I'll come back with a reply to this in the next couple of days. > >>>> > >>>> Sure, we can wait for few days definitely :) > >>> > >>> While the above certainly is true, it doesn't matter whether or not any > >>> out-of-the-tree code will be affected by removing this from the mainline. > >>> > >>> What matters is *only* whether or not anyone is going to add anything > >>> depending on it to the mainline any time soon. If that's not the case, > >>> the stuff goes away (and may be added back in the future if need be). > >>> > >>> To avoid delaying this indefinitely, let's make a deal as follows. > >>> > >>> Unless anyone with some changes targeted at the mainline and depending on the > >>> code in question shows up before the end of the merge window currently under > >>> way, I will queue up the patches from Viresh for 4.16. Then, there will be > >>> 8 weeks (or more) of time before the 4.16 merge window opens to drop them if > >>> any new material depending on the code removed by them materializes in the > >>> meantime. > >> > >> > >> Sure, as I mentioned before, if we failed to convince SoC manufacturers > >> to provide valid models, makes no sense to keep dead code in the tree, > >> you have my support and you can add my: > >> > >> Acked-by: Eduardo Valentin > >> > >> I am just not convinced that this is really about the static power > >> being negligible for IPA. > > > > Static power is definitely significant for IPA as is the accuracy of > all other values that contribute to the power request and allocation > calculations. > > For me, part the decision or whether to remove or to keep this code is > about how much use we can make of it now or in the future, as it stands. > > Information on static power, depending on platform and achievable > accuracy, can be as simple as a DT value (I would definitely not > recommend this to be the only supported source), a more complex callback > or maybe a value provided by firmware where the mechanism to obtain that > value is hidden. > A DT model would be easy to support with the current code but it would > be very inaccurate. > More complex algorithms provided as callbacks should give the possibility > for platform specific customization which lacks at the moment. But even > if it was supported, SoC manufactures are usually reluctant to share that > information. > Passing information from firmware would allow for platform specific > customization as well as hiding the mechanism through which is obtained, > but there's no standard way to obtain this value at the moment (probably > can be added to the OPP framework in the future). > > Another factor to consider is the imbalance between cpufreq and devfreq > cooling devices (devfreq cooling devices are still able to provide > static power information) that removing this code creates. > > Therefore, I would rather see this interface extended and not removed > completely, in order to give users the possibility to link a source of > information more appropriate for them. And it should all start with > support for one platform. But I can't volunteer my time in short term > to make these changes. So I can ack these patches for now, as the > justification for cleaning this code is correct, but sooner or later > better support for providing static power information should and will > be added. Well, I really do not see the point of a ask to extend the current API if have no single user of it. What are the current users problems with the API? What needs to be improved? What are the problems? We cannot tell, guess why? We have no users of it! Once again, do we have a reference platform? Oh, yes, that is Juno, and not even that is in mainline code. Folks, this can be a very nice discussion on how we can over engineer this API, but being pragmatic and avoiding wasting our time here, we all know what needs to be done. Dead code in mainline is hard to maintain. API to support out of the tree code is even harder. I am surprised to see this code was able to sustain itself in mainline with none challenging it for 2.5 y. So, we either get you guys to upstream at least one user of it or we just move on and remove the API, and keep mainline with only dynamic power, with periodic undershoots and overshoots. It is a simple decision: you either mainline it or keep IPA MORE inaccurate and take the burden to keep vendor own implementation of APIs for static power model on each product cycle, you choose. BR, Eduardo > > Best regards, > Ionela. > > > Just to reiterate once more, we are in agreement here. :) > > > > I'd like to think this patchset has come out of a plan to develop > > functionality on top but I don't know if that is the case. > > > >> > >>> > >>> Thanks, > >>> Rafael > >>> > > From 1584695990312546568@xxx Tue Nov 21 17:04:25 +0000 2017 X-GM-THRID: 1584123420538137355 X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread