Received: by 2002:a25:31c3:0:0:0:0:0 with SMTP id x186csp654383ybx; Fri, 1 Nov 2019 09:08:43 -0700 (PDT) X-Google-Smtp-Source: APXvYqyv8Lf5h4Jcx6BZL3ypqxEpZgwpm1wAc8fu51KueDPeCwYBMLahPJiHx7FKkCfkbtB3eKJz X-Received: by 2002:a50:f306:: with SMTP id p6mr13771114edm.284.1572624522919; Fri, 01 Nov 2019 09:08:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1572624522; cv=none; d=google.com; s=arc-20160816; b=yFHZnhqMb/rMX6gikfVLlLLws1F8j0dI4LclZ8XO/tQsjlCBEOF7d0WGqkVNqiT4Rb /Puc41CJe4CImz4La8lFU1wnic4E0j3toWw85hoLCEIPl46GYeK3WgKEiYQGshSMovct U4tmnd7hu8PxhwJZVAjUHT4GHFjeKKr0yTgxaLXJWKD5zYHBliTbluTm9p2Gnrxr/6lr mXg7iwqfdlOu/hhsXBgk1Y3EQamUjupsW0EP6+Qf1fWMcIuxM5FEWe5gua9pgUAUGmiL WYSNZkfDYHnidduRLlzDYi9SQGjaze155jB4oQLirmnnadn5ENn1d7uI0tHgV5E17ljx kGCA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=zauXZj4Rv7XiyRtkZboApJFqyaqsF05g2/NE1YQUKMg=; b=ZsUHUNDyWxefa9hjZV2YbzifXnO0wIT7BFF7BjBHubLoPprEHPeDfE4K8VK2ASzmUs cXgZrSUqPTiGgmk0+ASWjpJN1OGVjYtox3HPbvf/rwRqxAnYbZ3Tw0fZs5aG7M17YXQ3 OqZPFbvk1HKPneobu4fH9GRucdrG3XgnrOujBcDWkLeFreRoZvu0gjuwSjItT7XielWj 5ztAN6k/yCcZToPiAIgPCrVBB3RYlP17dOkwwKhZn+CmE3NejhrFt3kwn7u0HSKssl2z 3yDc1jAb8W7cMZ+sROKDw4XyNnW+7i5ru1KqYNcUtkrgtQip+sQzkMVGI9c5ejlluhBM DsJQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x25si5932381eju.357.2019.11.01.09.08.18; Fri, 01 Nov 2019 09:08:42 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728978AbfKAPrX (ORCPT + 99 others); Fri, 1 Nov 2019 11:47:23 -0400 Received: from foss.arm.com ([217.140.110.172]:37726 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726720AbfKAPrX (ORCPT ); Fri, 1 Nov 2019 11:47:23 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5C153328; Fri, 1 Nov 2019 08:47:22 -0700 (PDT) Received: from localhost (e108754-lin.cambridge.arm.com [10.1.199.68]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id F01CB3F719; Fri, 1 Nov 2019 08:47:21 -0700 (PDT) Date: Fri, 1 Nov 2019 15:47:20 +0000 From: Ionela Voinescu To: Vincent Guittot Cc: Dietmar Eggemann , Thara Gopinath , Ingo Molnar , Peter Zijlstra , Zhang Rui , Eduardo Valentin , Quentin Perret , linux-kernel , Amit Kachhap , Javi Merino , Daniel Lezcano Subject: Re: [Patch v4 5/6] thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping Message-ID: <20191101154612.GA4884@e108754-lin> References: <1571776465-29763-1-git-send-email-thara.gopinath@linaro.org> <1571776465-29763-6-git-send-email-thara.gopinath@linaro.org> <2b19d7da-412c-932f-7251-110eadbef3e3@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi guys, On Thursday 31 Oct 2019 at 17:38:25 (+0100), Vincent Guittot wrote: > On Thu, 31 Oct 2019 at 17:29, Dietmar Eggemann wrote: > > > > On 22.10.19 22:34, Thara Gopinath wrote: > > > Thermal governors can request for a cpu's maximum supported frequency > > > to be capped in case of an overheat event. This in turn means that the > > > maximum capacity available for tasks to run on the particular cpu is > > > reduced. Delta between the original maximum capacity and capped > > > maximum capacity is known as thermal pressure. Enable cpufreq cooling > > > device to update the thermal pressure in event of a capped > > > maximum frequency. > > > > > > Signed-off-by: Thara Gopinath > > > --- > > > drivers/thermal/cpu_cooling.c | 31 +++++++++++++++++++++++++++++-- > > > 1 file changed, 29 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c > > > index 391f397..2e6a979 100644 > > > --- a/drivers/thermal/cpu_cooling.c > > > +++ b/drivers/thermal/cpu_cooling.c > > > @@ -218,6 +218,23 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev, > > > } > > > > > > /** > > > + * update_sched_max_capacity - update scheduler about change in cpu > > > + * max frequency. > > > + * @policy - cpufreq policy whose max frequency is capped. > > > + */ > > > +static void update_sched_max_capacity(struct cpumask *cpus, > > > + unsigned int cur_max_freq, > > > + unsigned int max_freq) > > > +{ > > > + int cpu; > > > + unsigned long capacity = (cur_max_freq << SCHED_CAPACITY_SHIFT) / > > > + max_freq; > > > + > > > + for_each_cpu(cpu, cpus) > > > + update_thermal_pressure(cpu, capacity); > > > +} > > > + > > > +/** > > > * get_load() - get load for a cpu since last updated > > > * @cpufreq_cdev: &struct cpufreq_cooling_device for this cpu > > > * @cpu: cpu number > > > @@ -320,6 +337,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev, > > > unsigned long state) > > > { > > > struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata; > > > + int ret; > > > > > > /* Request state should be less than max_level */ > > > if (WARN_ON(state > cpufreq_cdev->max_level)) > > > @@ -331,8 +349,17 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev, > > > > > > cpufreq_cdev->cpufreq_state = state; > > > > > > - return dev_pm_qos_update_request(&cpufreq_cdev->qos_req, > > > - cpufreq_cdev->freq_table[state].frequency); > > > + ret = dev_pm_qos_update_request > > > + (&cpufreq_cdev->qos_req, > > > + cpufreq_cdev->freq_table[state].frequency); > > > + > > > + if (ret > 0) > > > + update_sched_max_capacity > > > + (cpufreq_cdev->policy->cpus, > > > + cpufreq_cdev->freq_table[state].frequency, > > > + cpufreq_cdev->policy->cpuinfo.max_freq); > > > + > > > + return ret; > > > } > > > > > > /** > > > > > > > Why not getting rid of update_sched_max_capacity() entirely and call > > update_thermal_pressure() in cpu_cooling.c directly? Saves one level in > > the call chain and would mean less code for this feature. > > But you add complexity in update_thermal_pressure which now has to > deal with a cpumask and to compute some frequency ratio > IMHO, it's cleaner to keep update_thermal_pressure simple as it is now > How about removing update_thermal_pressure altogether and doing all the work in update_sched_max_capacity? That is, have update_sched_max_capacity compute the capped_freq_ratio, do the normalization, and set it per_cpu for all CPUs in the frequency domain. You'll save some calculations that you're now doing in update_thermal_pressure for each cpu and you avoid shifting back and forth. If you're doing so it would be worth renaming update_sched_max_capacity to something like update_sched_thermal_pressure. Thanks, Ionela. > > > > Just compile tested on arm64: > > > > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c > > index 3211b4d3a899..bf36995013b0 100644 > > --- a/drivers/thermal/cpu_cooling.c > > +++ b/drivers/thermal/cpu_cooling.c > > @@ -217,23 +217,6 @@ static u32 cpu_power_to_freq(struct > > cpufreq_cooling_device *cpufreq_cdev, > > return freq_table[i - 1].frequency; > > } > > > > -/** > > - * update_sched_max_capacity - update scheduler about change in cpu > > - * max frequency. > > - * @policy - cpufreq policy whose max frequency is capped. > > - */ > > -static void update_sched_max_capacity(struct cpumask *cpus, > > - unsigned int cur_max_freq, > > - unsigned int max_freq) > > -{ > > - int cpu; > > - unsigned long capacity = (cur_max_freq << SCHED_CAPACITY_SHIFT) / > > - max_freq; > > - > > - for_each_cpu(cpu, cpus) > > - update_thermal_pressure(cpu, capacity); > > -} > > - > > /** > > * get_load() - get load for a cpu since last updated > > * @cpufreq_cdev: &struct cpufreq_cooling_device for this cpu > > @@ -353,7 +336,7 @@ static int cpufreq_set_cur_state(struct > > thermal_cooling_device *cdev, > > cpufreq_cdev->freq_table[state].frequency); > > > > if (ret > 0) > > - update_sched_max_capacity > > + update_thermal_pressure > > (cpufreq_cdev->policy->cpus, > > cpufreq_cdev->freq_table[state].frequency, > > cpufreq_cdev->policy->cpuinfo.max_freq); > > diff --git a/include/linux/sched.h b/include/linux/sched.h > > index 55dfe9634f67..5707813c7621 100644 > > --- a/include/linux/sched.h > > +++ b/include/linux/sched.h > > @@ -1985,9 +1985,9 @@ static inline void rseq_syscall(struct pt_regs *regs) > > #endif > > > > #ifdef CONFIG_SMP > > -void update_thermal_pressure(int cpu, u64 capacity); > > +void update_thermal_pressure(struct cpumask *cpus, unsigned int cur, > > unsigned int max); > > #else > > -static inline void update_thermal_pressure(int cpu, u64 capacity) > > +static inline void update_thermal_pressure(struct cpumask *cpus, > > unsigned int cur, unsigned int max); > > { > > } > > #endif > > diff --git a/kernel/sched/thermal.c b/kernel/sched/thermal.c > > index 0da31e12a5ff..691bdd79597a 100644 > > --- a/kernel/sched/thermal.c > > +++ b/kernel/sched/thermal.c > > @@ -43,17 +43,16 @@ static DEFINE_PER_CPU(unsigned long, delta_capacity); > > * the arch_scale_cpu_capacity and capped capacity is stored in per cpu > > * delta_capacity. > > */ > > -void update_thermal_pressure(int cpu, u64 capped_freq_ratio) > > +void update_thermal_pressure(struct cpumask *cpus, unsigned int cur, > > unsigned int max) > > { > > - unsigned long __capacity, delta; > > + int cpu; > > > > - /* Normalize the capped freq ratio */ > > - __capacity = (capped_freq_ratio * arch_scale_cpu_capacity(cpu)) >> > > - > > SCHED_CAPACITY_SHIFT; > > - delta = arch_scale_cpu_capacity(cpu) - __capacity; > > - pr_debug("updating cpu%d thermal pressure to %lu\n", cpu, delta); > > + for_each_cpu(cpu, cpus) { > > + unsigned long scale_cap = arch_scale_cpu_capacity(cpu); > > + unsigned long cur_cap = cur * scale_cap / max; > > > > - per_cpu(delta_capacity, cpu) = delta; > > + per_cpu(delta_capacity, cpu) = scale_cap - cur_cap; > > + } > > }