Received: by 2002:a5b:505:0:0:0:0:0 with SMTP id o5csp4752450ybp; Mon, 14 Oct 2019 09:28:22 -0700 (PDT) X-Google-Smtp-Source: APXvYqxR2BgO4/17q6zcL+55oNy3bXDh9rBb5ZrvDXRpOhHlAatznEl0R9kyJIoQheq6z66jFqIP X-Received: by 2002:aa7:d584:: with SMTP id r4mr28520334edq.92.1571070502722; Mon, 14 Oct 2019 09:28:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1571070502; cv=none; d=google.com; s=arc-20160816; b=BqWE76mhqrqCWvYtLNhb/d5Tk96MXa3pnjeObW0Mt41uPK1Vu3SMpQNMDjDixZxaiu pNiJlL26KgrpU30zT6/OZKflnqN5QMmGLzrFdWMYB5NdStFL8XuC1M15vM5Ld2ZLSYGY ZPpIa1hNt9ZFwMin5dIQZkKeEFpVl09rt8VwauAwmRnv7vic2/G4kVpKIndpjk80j//q p3TjAbYJbNvNF/MbASRYAahzUbzNJgkkvxc94cMl0K5tMXkcI0PA1M5Y/VILn0MY8cIi zbp8rewvkGvBOzmAs0nnRL0+bHqtdMAlq51o4+u6gUdiJ9jGzQ4f4XtyLXKpcZZi3Ai1 jo9A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=HhOsQy6UHZi94Qo2wix25ACjhisSG2e6KieU8k6VH3Q=; b=SlwMQi2ogk5E6UfoT2zcETyYxyZqC4/qWx3z2/lmZDTwKOBEawRcK/Y0aQnCfKjMOn xC17HiUsoPNdG9PgZNBrIImvftmcbBV0atUFEAIs+1AruLIGukmxZn8Skddv2Z4f9KBK tp6wSDURW8dHMF9lBorwT6ldWuSQMIh85C+gSCvCRgh0+emVlmPxw/OZzsUMXcJYIj08 ahhRRTbea9Z7R6Xkr4SqGQ+nhidBseA+BmdmK+rbvYkPtLWOadorXMXVh2S5UYHDVxjn nRJhoRn9r/JYUCu7mhYDmkKS6mCyTEhRMDXFyT5bGRxV9f1XeeyxAl/ghM6itXYuBzW1 wsmw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=C0zDPBy8; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e23si11242738ejt.379.2019.10.14.09.27.59; Mon, 14 Oct 2019 09:28:22 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=C0zDPBy8; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730093AbfJNPuW (ORCPT + 99 others); Mon, 14 Oct 2019 11:50:22 -0400 Received: from mail-lj1-f196.google.com ([209.85.208.196]:38629 "EHLO mail-lj1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730000AbfJNPuW (ORCPT ); Mon, 14 Oct 2019 11:50:22 -0400 Received: by mail-lj1-f196.google.com with SMTP id b20so17154017ljj.5 for ; Mon, 14 Oct 2019 08:50:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=HhOsQy6UHZi94Qo2wix25ACjhisSG2e6KieU8k6VH3Q=; b=C0zDPBy8mKMNrihW6Sq0nwwwQIZChvyeeM+Dbnsl3m0ERXia0RgQ2VralaYgTsV2rb XO/figIHm4DV1I7V51KkKqqXU5zNAhSKtVFBKblzH+AQbjgHYo/hmCq4Q00DRhgkX301 B/xIl+49oYAiJZa73xrXGMqPks7u4JYH3Vn5d6maOgyZArcIUGECCuW2B2lOAC4wXwIb VEJMbW7wmlMB3WX6NVdXOgI0E/CzJ6MMU2XsfeGHlYLb3JixQ5StsB5MqPuBMpJGwZPl kaV0VDMwsM34T3qi1jWJWTOYLLhwy7CZciEDHbr4NHIBqU/ftO8ZT0Sph4tF7yYLDKzm IAFQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=HhOsQy6UHZi94Qo2wix25ACjhisSG2e6KieU8k6VH3Q=; b=cU0XLWIHBpQDleMTAEUVMo5a6YH8zIcT++i7fV3gnrFX70h0rzCXNE8qgvjpn6nNoN eX8h4Op/jLOqZuJWJw9f7WGjohtAPUep7c7eDrlYXp7CSyqp1G39s/0mLaYwGtIQEpaV YQ1KaRkFtW3GbrOKnKt0zypH/P4CoXr1X+3wBv3pGptKUPhB21XYlivYU+bZB19XAcI9 qrYarrYuhKE5w6Tz09uRlXm2IddeJelVClXJhjwbWt+Krg6RlEbdP5bYMMsxWQEQnkdh 7YYOyg/uXtlJufnqQpRhXOswHq3q8Cm3sKu2ZQuoHNrNDh8S6kuC1+ZJmhVV6iNoXIIA 9TQg== X-Gm-Message-State: APjAAAWvM8QpVkU5plTN5QAP5tuja6ysmEFfLehBOYT11ilvtHoOdFjb LeAjW9MIlUNrOh9Da6JTUkFDNPEfNMiF7/RdX+ztpw== X-Received: by 2002:a05:651c:112e:: with SMTP id e14mr18845994ljo.193.1571068217836; Mon, 14 Oct 2019 08:50:17 -0700 (PDT) MIME-Version: 1.0 References: <1571014705-19646-1-git-send-email-thara.gopinath@linaro.org> <1571014705-19646-3-git-send-email-thara.gopinath@linaro.org> In-Reply-To: <1571014705-19646-3-git-send-email-thara.gopinath@linaro.org> From: Vincent Guittot Date: Mon, 14 Oct 2019 17:50:06 +0200 Message-ID: Subject: Re: [Patch v3 2/7] sched: Add infrastructure to store and update instantaneous thermal pressure To: Thara Gopinath Cc: Ingo Molnar , Peter Zijlstra , Ionela Voinescu , Zhang Rui , Eduardo Valentin , linux-kernel , Amit Kachhap , Javi Merino , Daniel Lezcano Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Thara, On Mon, 14 Oct 2019 at 02:58, Thara Gopinath wrote: > > Add thermal.c and thermal.h files that provides interface > APIs to initialize, update/average, track, accumulate and decay > thermal pressure per cpu basis. A per cpu structure max_capacity_info is > introduced to keep track of instantaneous per cpu thermal pressure. > Thermal pressure is the delta between max_capacity and cap_capacity. > API update_periodic_maxcap is called for periodic accumulate and decay > of the thermal pressure. It is to to be called from a periodic tick > function. This API calculates the delta between max_capacity and > cap_capacity and passes on the delta to update_thermal_avg to do the > necessary accumulate, decay and average. API update_maxcap_capacity is for > the system to update the thermal pressure by updating cap_capacity. > Considering, update_periodic_maxcap reads cap_capacity and > update_maxcap_capacity writes into cap_capacity, one can argue for > some sort of locking mechanism to avoid a stale value. > But considering update_periodic_maxcap can be called from a system > critical path like scheduler tick function, a locking mechanism is not > ideal. This means that it is possible the value used to > calculate average thermal pressure for a cpu can be stale for upto 1 > tick period. > > Signed-off-by: Thara Gopinath > --- > include/linux/sched.h | 14 +++++++++++ > kernel/sched/Makefile | 2 +- > kernel/sched/thermal.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++ > kernel/sched/thermal.h | 13 ++++++++++ > 4 files changed, 94 insertions(+), 1 deletion(-) > create mode 100644 kernel/sched/thermal.c > create mode 100644 kernel/sched/thermal.h > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 2c2e56b..875ce2b 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -1983,6 +1983,20 @@ static inline void rseq_syscall(struct pt_regs *regs) > > #endif > > +#ifdef CONFIG_SMP > +void update_maxcap_capacity(int cpu, u64 capacity); > + > +void populate_max_capacity_info(void); > +#else > +static inline void update_maxcap_capacity(int cpu, u64 capacity) > +{ > +} > + > +static inline void populate_max_capacity_info(void) > +{ > +} > +#endif > + > const struct sched_avg *sched_trace_cfs_rq_avg(struct cfs_rq *cfs_rq); > char *sched_trace_cfs_rq_path(struct cfs_rq *cfs_rq, char *str, int len); > int sched_trace_cfs_rq_cpu(struct cfs_rq *cfs_rq); > diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile > index 21fb5a5..4d3b820 100644 > --- a/kernel/sched/Makefile > +++ b/kernel/sched/Makefile > @@ -20,7 +20,7 @@ obj-y += core.o loadavg.o clock.o cputime.o > obj-y += idle.o fair.o rt.o deadline.o > obj-y += wait.o wait_bit.o swait.o completion.o > > -obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o topology.o stop_task.o pelt.o > +obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o topology.o stop_task.o pelt.o thermal.o > obj-$(CONFIG_SCHED_AUTOGROUP) += autogroup.o > obj-$(CONFIG_SCHEDSTATS) += stats.o > obj-$(CONFIG_SCHED_DEBUG) += debug.o > diff --git a/kernel/sched/thermal.c b/kernel/sched/thermal.c > new file mode 100644 > index 0000000..5f0b2d4 > --- /dev/null > +++ b/kernel/sched/thermal.c > @@ -0,0 +1,66 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Sceduler Thermal Interactions > + * > + * Copyright (C) 2018 Linaro, Inc., Thara Gopinath > + */ > + > +#include > +#include "sched.h" > +#include "pelt.h" > +#include "thermal.h" > + > +struct max_capacity_info { > + unsigned long max_capacity; > + unsigned long cap_capacity; > +}; > + > +static DEFINE_PER_CPU(struct max_capacity_info, max_cap); > + > +void update_maxcap_capacity(int cpu, u64 capacity) > +{ > + struct max_capacity_info *__max_cap; > + unsigned long __capacity; > + > + __max_cap = (&per_cpu(max_cap, cpu)); > + if (!__max_cap) { > + pr_err("no max_capacity_info structure for cpu %d\n", cpu); > + return; > + } > + > + /* Normalize the capacity */ > + __capacity = (capacity * arch_scale_cpu_capacity(cpu)) >> > + SCHED_CAPACITY_SHIFT; > + pr_debug("updating cpu%d capped capacity from %lu to %lu\n", cpu, __max_cap->cap_capacity, __capacity); > + > + __max_cap->cap_capacity = __capacity; > +} > + > +void populate_max_capacity_info(void) > +{ > + struct max_capacity_info *__max_cap; > + u64 capacity; > + int cpu; > + > + for_each_possible_cpu(cpu) { > + __max_cap = (&per_cpu(max_cap, cpu)); > + if (!__max_cap) > + continue; > + capacity = arch_scale_cpu_capacity(cpu); > + __max_cap->max_capacity = capacity; > + __max_cap->cap_capacity = capacity; > + pr_debug("cpu %d max capacity set to %ld\n", cpu, __max_cap->max_capacity); > + } > +} everything above seems to be there for the cpu cooling device and should be included in it instead. The scheduler only need the capacity capping The cpu cooling device should just set the delta capacity in the per-cpu variable (see my comment below) > + > +void update_periodic_maxcap(struct rq *rq) > +{ > + struct max_capacity_info *__max_cap = (&per_cpu(max_cap, cpu_of(rq))); > + unsigned long delta; > + > + if (!__max_cap) > + return; > + > + delta = __max_cap->max_capacity - __max_cap->cap_capacity; Why don't you just save the delta in the per_cpu variable instead of the struct max_capacity_info ? You have to compute the delta every tick whereas we can expect it to not change that much compared to the number of tick > + update_thermal_avg(rq_clock_task(rq), rq, delta); > +} > diff --git a/kernel/sched/thermal.h b/kernel/sched/thermal.h > new file mode 100644 > index 0000000..5657cb4 > --- /dev/null > +++ b/kernel/sched/thermal.h > @@ -0,0 +1,13 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Scheduler thermal interaction internal methods. > + */ > + > +#ifdef CONFIG_SMP > +void update_periodic_maxcap(struct rq *rq); > + > +#else > +static inline void update_periodic_maxcap(struct rq *rq) > +{ > +} > +#endif > -- > 2.1.4 >