2019-12-12 04:14:28

by Thara Gopinath

Subject: [Patch v6 3/7] Add infrastructure to store and update instantaneous thermal pressure

Add architecture specific APIs to update and track thermal pressure on a
per cpu basis. A per cpu variable thermal_pressure is introduced to keep
track of instantaneous per cpu thermal pressure. Thermal pressure is the
delta between maximum capacity and capped capacity due to a thermal event.
capacity and capped capacity due to a thermal event.

topology_get_thermal_pressure can be hooked into the scheduler specified
arch_scale_thermal_capacity to retrieve instantaneous thermal pressure of
a cpu.

arch_set_thermal_pressure can be used to update the thermal pressure by
providing a capped maximum capacity.
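
For illustration only, the delta bookkeeping described above can be sketched
in user-space C. This is not the kernel code: NR_CPUS, MAX_CAPACITY, the
plain array standing in for the per cpu variable, and both function names
are assumptions made for this sketch. MAX_CAPACITY is set to 1024 on the
assumption that the uncapped capacity equals the scheduler's capacity scale.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS      4        /* hypothetical CPU count for this sketch */
#define MAX_CAPACITY 1024UL   /* assumed uncapped capacity of every CPU */

/* Stand-in for the per cpu thermal_pressure variable. */
static unsigned long thermal_pressure[NR_CPUS];

/*
 * Store the pressure (max capacity minus capped capacity) for the CPUs
 * in [first, first + count). Hypothetical helper, not a kernel API.
 */
static void set_thermal_pressure(size_t first, size_t count,
				 unsigned long capped_capacity)
{
	assert(first + count <= NR_CPUS);
	assert(capped_capacity <= MAX_CAPACITY);

	unsigned long delta = MAX_CAPACITY - capped_capacity;

	for (size_t cpu = first; cpu < first + count; cpu++)
		thermal_pressure[cpu] = delta;
}

/* Return the last pressure value recorded for one CPU. */
static unsigned long get_thermal_pressure(size_t cpu)
{
	assert(cpu < NR_CPUS);
	return thermal_pressure[cpu];
}
```

With these assumptions, capping CPUs 0 and 1 to a capacity of 768 records a
pressure of 256 for each of them, while uncapped CPUs keep a pressure of 0.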

Considering topology_get_thermal_pressure reads thermal_pressure and
arch_set_thermal_pressure writes into thermal_pressure, one can argue for
some sort of locking mechanism to avoid a stale value. But considering
topology_get_thermal_pressure_average can be called from a system critical
path like scheduler tick function, a locking mechanism is not ideal. This
means that it is possible the thermal_pressure value used to calculate
average thermal pressure for a cpu can be stale for up to 1 tick period.
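
A rough user-space sketch of this lockless trade-off follows. It assumes C11
relaxed atomics as a stand-in for the WRITE_ONCE used in the patch, which is
only an approximation of the kernel primitive. The halving average is a
placeholder chosen for brevity and is not the averaging used by this series;
writer_update, tick_sample and both variables are hypothetical names.

```c
#include <stdatomic.h>

/* Instantaneous pressure of a single CPU, updated without a lock. */
static _Atomic unsigned long pressure;

/* Running average, only ever touched from the tick path. */
static unsigned long avg_pressure;

/* Thermal side: publish a new instantaneous value. */
static void writer_update(unsigned long delta)
{
	atomic_store_explicit(&pressure, delta, memory_order_relaxed);
}

/*
 * Tick side: sample whatever value is visible right now and fold it
 * into the average. An update that lands just after the load is only
 * picked up on the next call, so the sample can lag by one period.
 */
static unsigned long tick_sample(void)
{
	unsigned long now = atomic_load_explicit(&pressure,
						 memory_order_relaxed);

	avg_pressure = (avg_pressure + now) / 2;
	return avg_pressure;
}
```

The point of the sketch is that neither side ever blocks: the reader accepts
a value that may be one sampling period old in exchange for staying off any
lock in the tick path.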

Signed-off-by: Thara Gopinath <[email protected]>
---

v3->v4:
- Dropped per cpu max_capacity_info struct and instead added a per cpu
delta_capacity variable to store the delta between maximum
capacity and capped capacity. The delta is now calculated when
thermal pressure is updated and not every tick.
- Dropped populate_max_capacity_info api as only per cpu delta
capacity is stored.
- Renamed update_periodic_maxcap to
trigger_thermal_pressure_average and update_maxcap_capacity to
update_thermal_pressure.
v4->v5:
- As per Peter's review comments folded thermal.c into fair.c.
- As per Ionela's review comments revamped update_thermal_pressure
to take maximum available capacity as input instead of maximum
capped frequency ratio.
v5->v6:
- As per review comments moved all the infrastructure to track
and retrieve instantaneous thermal pressure out of scheduler
to topology files.

arch/arm/include/asm/topology.h | 3 +++
arch/arm64/include/asm/topology.h | 3 +++
drivers/base/arch_topology.c | 13 +++++++++++++
include/linux/arch_topology.h | 11 +++++++++++
4 files changed, 30 insertions(+)

diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index 8a0fae9..90b18c3 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -16,6 +16,9 @@
/* Enable topology flag updates */
#define arch_update_cpu_topology topology_update_cpu_topology

+/* Replace task scheduler's defalut thermal pressure retrieve API */
+#define arch_scale_thermal_capacity topology_get_thermal_pressure
+
#else

static inline void init_cpu_topology(void) { }
diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index a4d945d..ccb277b 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -25,6 +25,9 @@ int pcibus_to_node(struct pci_bus *bus);
/* Enable topology flag updates */
#define arch_update_cpu_topology topology_update_cpu_topology

+/* Replace task scheduler's defalut thermal pressure retrieve API */
+#define arch_scale_thermal_capacity topology_get_thermal_pressure
+
#include <asm-generic/topology.h>

#endif /* _ASM_ARM_TOPOLOGY_H */
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1eb81f11..3a91379 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -42,6 +42,19 @@ void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
per_cpu(cpu_scale, cpu) = capacity;
}

+DEFINE_PER_CPU(unsigned long, thermal_pressure);
+
+void arch_set_thermal_pressure(struct cpumask *cpus,
+			       unsigned long capped_capacity)
+{
+	int cpu;
+	unsigned long delta = arch_scale_cpu_capacity(cpumask_first(cpus)) -
+					capped_capacity;
+
+	for_each_cpu(cpu, cpus)
+		WRITE_ONCE(per_cpu(thermal_pressure, cpu), delta);
+}
+
static ssize_t cpu_capacity_show(struct device *dev,
struct device_attribute *attr,
char *buf)
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 3015ecb..7a04364 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -33,6 +33,17 @@ unsigned long topology_get_freq_scale(int cpu)
return per_cpu(freq_scale, cpu);
}

+DECLARE_PER_CPU(unsigned long, thermal_pressure);
+
+static inline
+unsigned long topology_get_thermal_pressure(int cpu)
+{
+	return per_cpu(thermal_pressure, cpu);
+}
+
+void arch_set_thermal_pressure(struct cpumask *cpus,
+			       unsigned long capped_capacity);
+
struct cpu_topology {
int thread_id;
int core_id;
--
2.1.4


2019-12-23 17:52:25

by Ionela Voinescu

Subject: Re: [Patch v6 3/7] Add infrastructure to store and update instantaneous thermal pressure

Hi Thara,

On Wednesday 11 Dec 2019 at 23:11:44 (-0500), Thara Gopinath wrote:
> Add architecture specific APIs to update and track thermal pressure on a
> per cpu basis. A per cpu variable thermal_pressure is introduced to keep
> track of instantaneous per cpu thermal pressure. Thermal pressure is the
> delta between maximum capacity and capped capacity due to a thermal event.
> capacity and capped capacity due to a thermal event.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This line seems to be a duplicate (initially I thought I was seeing
double :) ).

> topology_get_thermal_pressure can be hooked into the scheduler specified
> arch_scale_thermal_capacity to retrieve instantaneous thermal pressure of
> a cpu.
>
> arch_set_thermal_pressure can be used to update the thermal pressure by
> providing a capped maximum capacity.
>
> Considering topology_get_thermal_pressure reads thermal_pressure and
> arch_set_thermal_pressure writes into thermal_pressure, one can argue for
> some sort of locking mechanism to avoid a stale value. But considering
> topology_get_thermal_pressure_average can be called from a system critical
> path like scheduler tick function, a locking mechanism is not ideal. This
> means that it is possible the thermal_pressure value used to calculate
> average thermal pressure for a cpu can be stale for up to 1 tick period.
>
> Signed-off-by: Thara Gopinath <[email protected]>
[...]
> diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> index 8a0fae9..90b18c3 100644
> --- a/arch/arm/include/asm/topology.h
> +++ b/arch/arm/include/asm/topology.h
> @@ -16,6 +16,9 @@
> /* Enable topology flag updates */
> #define arch_update_cpu_topology topology_update_cpu_topology
>
> +/* Replace task scheduler's defalut thermal pressure retrieve API */

s/defalut/default

> +#define arch_scale_thermal_capacity topology_get_thermal_pressure
> +

I also think this is deserving of a better name. I would drop the
'scale' part as well as it is not used as a scale factor, as
freq_scale or cpu_scale, but it's used as a reduction in capacity
(thermal capacity pressure) due to a thermal event.

It might be too much but what do you think about:
arch_thermal_capacity_pressure?
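
To make the distinction concrete, here is a small user-space sketch of the
two usages. It assumes a scale factor is applied multiplicatively against a
1024 base, whereas a pressure value is subtracted from capacity. The macros
and both helper names are hypothetical and exist only for this illustration.

```c
#define CAPACITY_SHIFT 10
#define CAPACITY_SCALE (1UL << CAPACITY_SHIFT)   /* 1024 */

/* Scale factor: multiply, then normalise by the 1024 base. */
static unsigned long apply_scale(unsigned long value, unsigned long scale)
{
	return (value * scale) >> CAPACITY_SHIFT;
}

/* Pressure: subtract the lost capacity, clamping at zero. */
static unsigned long apply_pressure(unsigned long capacity,
				    unsigned long pressure)
{
	return pressure < capacity ? capacity - pressure : 0;
}
```

Under these assumptions a scale of 512 halves a value of 1024, while a
pressure of 256 leaves 768 of a 1024 capacity, which is why a name built
around 'scale' reads oddly for the second case.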

> #else
>
> static inline void init_cpu_topology(void) { }
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index a4d945d..ccb277b 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -25,6 +25,9 @@ int pcibus_to_node(struct pci_bus *bus);
> /* Enable topology flag updates */
> #define arch_update_cpu_topology topology_update_cpu_topology
>
> +/* Replace task scheduler's defalut thermal pressure retrieve API */

s/defalut/default

Regards,
Ionela.