From: "Rafael J. Wysocki"
Date: Mon, 6 Aug 2018 11:20:59 +0200
Subject: Re: [PATCH v8 07/26] PM / Domains: Add genpd governor for CPUs
To: Ulf Hansson
Cc: "Rafael J. Wysocki", Sudeep Holla, Lorenzo Pieralisi, Mark Rutland,
    Linux PM, Kevin Hilman, Lina Iyer, Rob Herring, Daniel Lezcano,
    Thomas Gleixner, Vincent Guittot, Stephen Boyd, Juri Lelli,
    Geert Uytterhoeven, Linux ARM, linux-arm-msm,
    Linux Kernel Mailing List, Frederic Weisbecker, Ingo Molnar
References: <20180620172226.15012-1-ulf.hansson@linaro.org>
    <20180620172226.15012-8-ulf.hansson@linaro.org>
    <3574880.GjmnMm1lMq@aspire.rjw.lan>
    <10360149.m4MlxDWZY5@aspire.rjw.lan>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Aug 3, 2018 at 4:28 PM, Ulf Hansson wrote:
> On 26 July 2018 at 11:14, Rafael J. Wysocki wrote:
>> On Thursday, July 19, 2018 12:32:52 PM CEST Rafael J. Wysocki wrote:
>>> On Wednesday, June 20, 2018 7:22:07 PM CEST Ulf Hansson wrote:
>>> > As it's now perfectly possible that a PM domain managed by genpd contains
>>> > devices belonging to CPUs, we should start to take into account the
>>> > residency values for the idle states during the state selection process.
>>> > The residency value specifies the minimum duration of time, the CPU or a
>>> > group of CPUs, needs to spend in an idle state to not waste energy entering
>>> > it.
>>> >
>>> > To deal with this, let's add a new genpd governor, pm_domain_cpu_gov, that
>>> > may be used for a PM domain that have CPU devices attached or if the CPUs
>>> > are attached through subdomains.
>>> >
>>> > The new governor computes the minimum expected idle duration time for the
>>> > online CPUs being attached to the PM domain and its subdomains. Then in the
>>> > state selection process, trying the deepest state first, it verifies that
>>> > the idle duration time satisfies the state's residency value.
>>> >
>>> > It should be noted that, when computing the minimum expected idle duration
>>> > time, we use the information from tick_nohz_get_next_wakeup(), to find the
>>> > next wakeup for the related CPUs. Future wise, this may deserve to be
>>> > improved, as there are more reasons to why a CPU may be woken up from idle.
>>> >
>>> > Cc: Thomas Gleixner
>>> > Cc: Daniel Lezcano
>>> > Cc: Lina Iyer
>>> > Cc: Frederic Weisbecker
>>> > Cc: Ingo Molnar
>>> > Co-developed-by: Lina Iyer
>>> > Signed-off-by: Ulf Hansson
>>> > ---
>>> >  drivers/base/power/domain_governor.c | 58 ++++++++++++++++++++++++++++
>>> >  include/linux/pm_domain.h            |  2 +
>>> >  2 files changed, 60 insertions(+)
>>> >
>>> > diff --git a/drivers/base/power/domain_governor.c b/drivers/base/power/domain_governor.c
>>> > index 99896fbf18e4..1aad55719537 100644
>>> > --- a/drivers/base/power/domain_governor.c
>>> > +++ b/drivers/base/power/domain_governor.c
>>> > @@ -10,6 +10,9 @@
>>> >  #include
>>> >  #include
>>> >  #include
>>> > +#include
>>> > +#include
>>> > +#include
>>> >
>>> >  static int dev_update_qos_constraint(struct device *dev, void *data)
>>> >  {
>>> > @@ -245,6 +248,56 @@ static bool always_on_power_down_ok(struct dev_pm_domain *domain)
>>> >  	return false;
>>> >  }
>>> >
>>> > +static bool cpu_power_down_ok(struct dev_pm_domain *pd)
>>> > +{
>>> > +	struct generic_pm_domain *genpd = pd_to_genpd(pd);
>>> > +	ktime_t domain_wakeup, cpu_wakeup;
>>> > +	s64 idle_duration_ns;
>>> > +	int cpu, i;
>>> > +
>>> > +	if (!(genpd->flags & GENPD_FLAG_CPU_DOMAIN))
>>> > +		return true;
>>> > +
>>> > +	/*
>>> > +	 * Find the next wakeup for any of the online CPUs within the PM domain
>>> > +	 * and its subdomains. Note, we only need the genpd->cpus, as it already
>>> > +	 * contains a mask of all CPUs from subdomains.
>>> > +	 */
>>> > +	domain_wakeup = ktime_set(KTIME_SEC_MAX, 0);
>>> > +	for_each_cpu_and(cpu, genpd->cpus, cpu_online_mask) {
>>> > +		cpu_wakeup = tick_nohz_get_next_wakeup(cpu);
>>> > +		if (ktime_before(cpu_wakeup, domain_wakeup))
>>> > +			domain_wakeup = cpu_wakeup;
>>> > +	}
>>
>> Here's a concern I have missed before. :-/
>>
>> Say, one of the CPUs you're walking here is woken up in the meantime.
>
> Yes, that can happen - when we mispredicted "next wakeup".
>
>>
>> I don't think it is valid to evaluate tick_nohz_get_next_wakeup() for it then
>> to update domain_wakeup. We really should just avoid the domain power off in
>> that case at all IMO.
>
> Correct.
>
> However, we also want to avoid locking contentions in the idle path,
> which is what this boils down to.

This already is done under genpd_lock() AFAICS, so I'm not quite sure
what exactly you mean.

Besides, this is not just about increased latency, which is a concern
by itself but maybe not so much in all environments, but also about
the possibility of missing a CPU wakeup, which is a major issue.

If one of the CPUs sharing the domain with the current one is woken up
during cpu_power_down_ok() and the wakeup is an edge-triggered
interrupt and the domain is turned off regardless, the wakeup may be
missed entirely if I'm not mistaken.

It looks like there needs to be a way for the hardware to prevent a
domain poweroff when there's a pending interrupt or I don't quite see
how this can be handled correctly.

>> Sure enough, if the domain power off is already started and one of the CPUs
>> in the domain is woken up then, too bad, it will suffer the latency (but in
>> that case the hardware should be able to help somewhat), but otherwise CPU
>> wakeup should prevent domain power off from being carried out.
>
> The CPU is not prevented from waking up, as we rely on the FW to deal with that.
>
> Even if the above computation turns out to wrongly suggest that the
> cluster can be powered off, the FW shall together with the genpd
> backend driver prevent it.

Fine, but then the solution depends on specific FW/HW behavior, so I'm
not sure how generic it really is. At least, that expectation should
be clearly documented somewhere, preferably in code comments.

> To cover this case for PSCI, we also use a per cpu variable for the
> CPU's power off state, as can be seen later in the series.

Oh great, but the generic part should be independent of the underlying
implementation of the driver. If it isn't, then it also is not
generic.

> Hope this clarifies your concern, else tell me and I will elaborate a bit more.

Not really.

There also is one more problem and that is the interaction between
this code and the idle governor.

Namely, the idle governor may select a shallower state for some
reason, for example due to an additional latency limit derived from
CPU utilization (like in the menu governor), and how does the code in
cpu_power_down_ok() know what state has been selected and how does it
honor the selection made by the idle governor?
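
For reference, the selection logic described in the quoted changelog
(take the earliest next wakeup among the domain's CPUs, then try the
deepest state first and accept it only if the expected idle duration
covers its residency) can be sketched in plain user-space C. This is
only an illustration under stated assumptions, not the code from the
patch: struct domain_state, earliest_wakeup_ns() and
pick_domain_state() are hypothetical names, times are plain
nanosecond counters rather than ktime_t, and states are assumed to be
ordered from shallowest to deepest. It does not address the race or
the idle governor interaction raised above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a domain idle state; only the residency is modeled. */
struct domain_state {
	int64_t residency_ns;	/* minimum time in the state for it to pay off */
};

/*
 * Return the earliest next-wakeup time among the given CPUs, or INT64_MAX
 * when the array is empty (no online CPU in the domain).
 */
int64_t earliest_wakeup_ns(const int64_t *cpu_wakeup_ns, size_t nr_cpus)
{
	int64_t earliest = INT64_MAX;

	for (size_t i = 0; i < nr_cpus; i++) {
		if (cpu_wakeup_ns[i] < earliest)
			earliest = cpu_wakeup_ns[i];
	}
	return earliest;
}

/*
 * Pick the deepest state whose residency fits into the expected idle
 * duration. Assumes now_ns >= 0 and states[] ordered shallow to deep.
 * Returns the state index, or -1 when no state fits or the earliest
 * wakeup is not in the future, in which case the domain stays on.
 */
int pick_domain_state(const struct domain_state *states, int nr_states,
		      int64_t now_ns, int64_t domain_wakeup_ns)
{
	int64_t idle_ns = domain_wakeup_ns - now_ns;

	assert(states != NULL && nr_states > 0);

	/* A wakeup that is due or already past leaves no idle time at all. */
	if (idle_ns <= 0)
		return -1;

	/* Try the deepest state first, as the changelog describes. */
	for (int i = nr_states - 1; i >= 0; i--) {
		if (idle_ns >= states[i].residency_ns)
			return i;
	}
	return -1;
}
```

A caller would pass the result of earliest_wakeup_ns() as
domain_wakeup_ns. Note that the check for a wakeup that is already due
only covers what is visible when the times are sampled; a CPU woken up
after that point is exactly the case discussed above, and the sketch
offers nothing for it.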