From: Vincent Guittot
Date: Wed, 10 Oct 2018 12:14:32 +0200
Subject: Re: [RFC PATCH 0/7] Introduce thermal pressure
To: Quentin Perret
Cc: Ingo Molnar, Thara Gopinath, linux-kernel, Ingo Molnar, Peter Zijlstra, Zhang Rui, "gregkh@linuxfoundation.org", "Rafael J. Wysocki", Amit Kachhap, viresh kumar, Javi Merino, Eduardo Valentin, Daniel Lezcano, "open list:THERMAL", Ionela Voinescu
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 10 Oct 2018 at 11:55, Quentin Perret wrote:
>
> Hi Vincent,
>
> On Wednesday 10 Oct 2018 at 10:50:05 (+0200), Vincent Guittot wrote:
> > The problem with reflecting directly the capping is that it happens
> > far more often than the pace at which cpu_capacity_orig is updated in
> > the scheduler.
>
> Hmm, how can you be so sure ? That most likely depends on the workload,
> the platform and the thermal governor. Some platforms heat up slowly,
> some quickly. The pace at which the thermal governor will change things
> should depend on that I assume.
>
> > This means that at the moment when scheduler uses the
> > value, it might not be correct anymore.
>
> And OTOH, when you remove a cap for example, it will take time before
> the scheduler can see the newly available capacity if you need to wait
> for the signal to decay. So you are using a wrong information too in
> that scenario.

But we stay consistent with all other metrics. The same happens when a
task decides to stay idle for a long time after some activity: you
have to wait for the signal to decay.

>
> > Then, this value is also used
> > when building the sched_domain and setting max_cpu_capacity which
> > would also imply rebuilding the sched_domain topology ...
>
> Wait what ? I thought the thermal cap was reflected in capacity_of, not
> capacity_orig_of ... You need to rebuild the sched_domain in case of
> thermal pressure ?
>
> Hmm, let me have a closer look at the patches, I must have missed
> something ...
>
> > The pace of changing the capping is too fast to reflect that in the
> > whole scheduler topology
>
> That's probably true in some cases, but it'd be cool to have numbers to
> back up that statement, I think.
>
> Now, if you do need to rebuild the sched domain topology every time you
> update the thermal pressure, I think the PELT HL is _way_ too short for
> that ... You can't rebuild the whole thing every 32ms or so. Or am I
> misunderstanding something ?
>
> > > Thara, have you tried to experiment with a simpler implementation as
> > > suggested by Ingo ?
> > >
> > > Also, assuming that we do want to average things, do we actually want to
> > > tie the thermal ramp-up time to the PELT half life ? That provides
> > > nice maths properties wrt the other signals, but it's not obvious to me
> > > that this thermal 'constant' should be the same on all platforms. Or
> > > maybe it should ?
> >
> > The main interest of using PELT signal is that thermal pressure will
> > evolve at the same pace as other signals used in the scheduler.
>
> Right, I think this is a nice property too (assuming that we actually
> want to average things out).
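
To put a rough number on the "wait for the signal to decay" point, here
is a minimal sketch of a geometric decay with a 32-period half-life. It
is only an illustration, not the kernel's PELT code: the names
HALF_LIFE_PERIODS, decayed() and periods_to_reach() are made up for this
example, and a period is assumed to be roughly 1 ms.

```python
import math

# Assumption for illustration: the signal halves every 32 periods,
# with one period being roughly 1 ms.
HALF_LIFE_PERIODS = 32
Y = 0.5 ** (1.0 / HALF_LIFE_PERIODS)  # per-period decay factor, Y**32 == 0.5


def decayed(value, periods):
    """Value of the signal after `periods` periods with no new contribution."""
    return value * Y ** periods


def periods_to_reach(fraction):
    """Whole periods needed for the signal to fall to `fraction` of its start."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return math.ceil(HALF_LIFE_PERIODS * math.log2(1.0 / fraction))


if __name__ == "__main__":
    # A pressure of 1024 left to decay for one half-life.
    print(decayed(1024.0, HALF_LIFE_PERIODS))
    # Periods until only 10% of the old pressure is still visible.
    print(periods_to_reach(0.1))
```

Under these assumptions, a removed cap is still 10% visible after 107
periods, which gives an idea of the delay being discussed here.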
>
> > With
> > thermal pressure, we have the exact same problem as with RT tasks. The
> > thermal will cap the max frequency which will cap the utilization of
> > the tasks running on the CPU
>
> Well, the nature of the signal is slightly different IMO. Yes it's
> capacity, but you're not actually measuring time spent on the CPU. All
> other PELT signals are based on time, this thermal thing isn't, so it is
> kinda different in a way. And I'm still wondering if it could be helpful

Hmm ... it is based on time too. Both signals (the current ones and the
thermal one) are really close. The main difference from the other
utilization signals is that, instead of providing a running/not-running
boolean that is then weighted by the current capacity, the signal
directly uses the capped max capacity to reflect the amount of cycles
stolen by thermal mitigation.

> to be able to have a different HL for that thermal signal. That would
> 'break' the nice maths properties we have, yes, but is it a problem or is
> it actually helpful to cope with the thermal characteristics of
> different platforms ?

If you don't use the same kind of signal with the same responsiveness,
you will start to see some OPP toggles, for example, when the thermal
state changes, because one metric will change faster than the other and
you can't have a correct view of the system. The same problem was
happening with RT tasks. I take the example of RT tasks because they are
quite similar in effect, except that an RT task steals all cycles when
it runs whereas thermal mitigation steals only some by capping the
frequency.

>
> Thanks,
> Quentin
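
The responsiveness argument above can be illustrated with a small
sketch. It uses a simplified exponentially weighted average with a
configurable half-life, not the kernel's PELT implementation; the
function name step_response() and the half-life of 8 periods for the
"faster" signal are assumptions picked only for this example.

```python
def step_response(target, half_life, periods):
    """Exponentially weighted average tracking a step from 0 to `target`.

    `half_life` is the number of periods after which half of the gap to
    `target` has been closed. Returns the value after `periods` periods.
    """
    y = 0.5 ** (1.0 / half_life)  # per-period decay factor
    signal = 0.0
    for _ in range(periods):
        signal = signal * y + target * (1.0 - y)
    return signal


if __name__ == "__main__":
    stolen = 1024.0  # hypothetical capacity removed by a thermal cap
    periods = 32

    same_pace = step_response(stolen, half_life=32, periods=periods)
    faster = step_response(stolen, half_life=8, periods=periods)

    # With the same input, the two averages disagree for a while, so
    # anything combining them sees an inconsistent picture until both settle.
    print(same_pace, faster, faster - same_pace)
```

With these numbers, after 32 periods the 32-period signal has covered
half of the step while the 8-period one is already at about 94% of it,
which is the kind of mismatch between metrics described above.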