2023-12-06 11:31:07

by Lukasz Luba

[permalink] [raw]
Subject: [PATCH 0/5] Add callback for cooling list update to speed-up IPA

Hi all,

The patch set a new callback for thermal governors and implementation for
Intelligent Power Allocator.

The goal is to move the memory allocations and heavy computations
(multiplications) out of throttle() callback hot path.

In addition there are some small clean-ups for IPA code.

Regards,
Lukasz

Lukasz Luba (5):
thermal: core: Add callback for governors with cooling instances
change
thermal: gov_power_allocator: Refactor check_power_actors()
thermal: gov_power_allocator: Move memory allocation out of throttle()
thermal: gov_power_allocator: Simplify checks for valid power actor
thermal: gov_power_allocator: Refactor checks in divvy_up_power()

drivers/thermal/gov_power_allocator.c | 202 +++++++++++++++++---------
drivers/thermal/thermal_core.c | 14 ++
include/linux/thermal.h | 4 +
3 files changed, 154 insertions(+), 66 deletions(-)

--
2.25.1


2023-12-06 11:31:13

by Lukasz Luba

[permalink] [raw]
Subject: [PATCH 1/5] thermal: core: Add callback for governors with cooling instances change

Allow governors to react to the changes in the cooling instances list.
That makes possible to move memory allocations related to the number of
cooling instances out of the throttle() callback. The throttle() callback
is called much more often thus the cost of managing allocations there is
an extra overhead. The list of cooling instances is not changed that often
and it can be handled in dedicated callback. That will save CPU cycles
in the throttle() code path. Both callbacks code paths are protected with
the same thermal zone lock, which guaranties the list of cooling instances
is consistent.

Signed-off-by: Lukasz Luba <[email protected]>
---
drivers/thermal/thermal_core.c | 14 ++++++++++++++
include/linux/thermal.h | 4 ++++
2 files changed, 18 insertions(+)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 625ba07cbe2f..c993b86f7fb5 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -314,6 +314,15 @@ static void handle_non_critical_trips(struct thermal_zone_device *tz,
def_governor->throttle(tz, trip);
}

+static void handle_instances_list_update(struct thermal_zone_device *tz)
+{
+
+ if (!tz->governor || !tz->governor->instances_update)
+ return;
+
+ tz->governor->instances_update(tz);
+}
+
void thermal_zone_device_critical(struct thermal_zone_device *tz)
{
/*
@@ -723,6 +732,8 @@ int thermal_bind_cdev_to_trip(struct thermal_zone_device *tz,
list_add_tail(&dev->tz_node, &tz->thermal_instances);
list_add_tail(&dev->cdev_node, &cdev->thermal_instances);
atomic_set(&tz->need_update, 1);
+
+ handle_instances_list_update(tz);
}
mutex_unlock(&cdev->lock);
mutex_unlock(&tz->lock);
@@ -781,6 +792,9 @@ int thermal_unbind_cdev_from_trip(struct thermal_zone_device *tz,
if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) {
list_del(&pos->tz_node);
list_del(&pos->cdev_node);
+
+ handle_instances_list_update(tz);
+
mutex_unlock(&cdev->lock);
mutex_unlock(&tz->lock);
goto unbind;
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index c7190e2dfcb4..e7b2a1f4bab0 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -195,6 +195,9 @@ struct thermal_zone_device {
* thermal zone.
* @throttle: callback called for every trip point even if temperature is
* below the trip point temperature
+ * @instances_update: callback called when thermal zone instances list
+ * i has changed (e.g. added new or removed), which
+ * may help to offload work for governor like allocations
* @governor_list: node in thermal_governor_list (in thermal_core.c)
*/
struct thermal_governor {
@@ -203,6 +206,7 @@ struct thermal_governor {
void (*unbind_from_tz)(struct thermal_zone_device *tz);
int (*throttle)(struct thermal_zone_device *tz,
const struct thermal_trip *trip);
+ void (*instances_update)(struct thermal_zone_device *tz);
struct list_head governor_list;
};

--
2.25.1

2023-12-06 11:31:15

by Lukasz Luba

[permalink] [raw]
Subject: [PATCH 2/5] thermal: gov_power_allocator: Refactor check_power_actors()

Refactor check_power_actors() to make it possible for re-use in the
upcoming new callback.

No intentional functional impact.

Signed-off-by: Lukasz Luba <[email protected]>
---
drivers/thermal/gov_power_allocator.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index 785fff14223d..38e1e89ba10c 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -581,8 +581,9 @@ static void allow_maximum_power(struct thermal_zone_device *tz, bool update)
* power actor API. The warning should help to investigate the issue, which
* could be e.g. lack of Energy Model for a given device.
*
- * Return: 0 on success, -EINVAL if any cooling device does not implement
- * the power actor API.
+ * Return number of cooling devices or -EINVAL if any cooling device does not
+ * implement the power actor API. Return value 0 is also valid since cooling
+ * devices might be attached later.
*/
static int check_power_actors(struct thermal_zone_device *tz,
struct power_allocator_params *params)
@@ -597,8 +598,9 @@ static int check_power_actors(struct thermal_zone_device *tz,
if (!cdev_is_power_actor(instance->cdev)) {
dev_warn(&tz->device, "power_allocator: %s is not a power actor\n",
instance->cdev->type);
- ret = -EINVAL;
+ return -EINVAL;
}
+ ret++;
}

return ret;
@@ -631,7 +633,7 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
}

ret = check_power_actors(tz, params);
- if (ret) {
+ if (ret < 0) {
dev_warn(&tz->device, "power_allocator: binding failed\n");
kfree(params);
return ret;
--
2.25.1

2023-12-06 11:31:22

by Lukasz Luba

[permalink] [raw]
Subject: [PATCH 3/5] thermal: gov_power_allocator: Move memory allocation out of throttle()

The new thermal callback allows to react to the change of cooling
instances in the thermal zone. Move the memory allocation to that new
callback and save CPU cycles in the throttle() code path.

Signed-off-by: Lukasz Luba <[email protected]>
---
drivers/thermal/gov_power_allocator.c | 137 ++++++++++++++++++++------
1 file changed, 106 insertions(+), 31 deletions(-)

diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index 38e1e89ba10c..1b55d00fc12b 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -61,6 +61,13 @@ static inline s64 div_frac(s64 x, s64 y)
* @trip_switch_on should be NULL.
* @trip_max: last passive trip point of the thermal zone. The
* temperature we are controlling for.
+ * @num_actors: number of cooling devices supporting IPA callbacks
+ * @buffer_size: IPA internal buffer size
+ * @req_power: IPA buffer for requested power
+ * @max_power: IPA buffer for max allocatable power
+ * @granted_power: IPA buffer for granted power
+ * @extra_actor_power: IPA buffer for extra power
+ * @weighted_req_power: IPA buffer for weighted requested power
*/
struct power_allocator_params {
bool allocated_tzp;
@@ -69,6 +76,13 @@ struct power_allocator_params {
u32 sustainable_power;
const struct thermal_trip *trip_switch_on;
const struct thermal_trip *trip_max;
+ int num_actors;
+ int buffer_size;
+ u32 *req_power;
+ u32 *max_power;
+ u32 *granted_power;
+ u32 *extra_actor_power;
+ u32 *weighted_req_power;
};

/**
@@ -387,39 +401,24 @@ static int allocate_power(struct thermal_zone_device *tz, int control_temp)
u32 *weighted_req_power;
u32 power_range, weight;
int total_weight = 0;
- int num_actors = 0;
int i = 0;

- list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
+ if (!params->num_actors)
+ return -ENODEV;
+
+ list_for_each_entry(instance, &tz->thermal_instances, tz_node)
if ((instance->trip == params->trip_max) &&
- cdev_is_power_actor(instance->cdev)) {
- num_actors++;
+ cdev_is_power_actor(instance->cdev))
total_weight += instance->weight;
- }
- }
-
- if (!num_actors)
- return -ENODEV;

- /*
- * We need to allocate five arrays of the same size:
- * req_power, max_power, granted_power, extra_actor_power and
- * weighted_req_power. They are going to be needed until this
- * function returns. Allocate them all in one go to simplify
- * the allocation and deallocation logic.
- */
- BUILD_BUG_ON(sizeof(*req_power) != sizeof(*max_power));
- BUILD_BUG_ON(sizeof(*req_power) != sizeof(*granted_power));
- BUILD_BUG_ON(sizeof(*req_power) != sizeof(*extra_actor_power));
- BUILD_BUG_ON(sizeof(*req_power) != sizeof(*weighted_req_power));
- req_power = kcalloc(num_actors * 5, sizeof(*req_power), GFP_KERNEL);
- if (!req_power)
- return -ENOMEM;
+ /* Clean all buffers for new power estimations */
+ memset(params->req_power, 0, params->buffer_size);

- max_power = &req_power[num_actors];
- granted_power = &req_power[2 * num_actors];
- extra_actor_power = &req_power[3 * num_actors];
- weighted_req_power = &req_power[4 * num_actors];
+ req_power = params->req_power;
+ max_power = params->max_power;
+ granted_power = params->granted_power;
+ extra_actor_power = params->extra_actor_power;
+ weighted_req_power = params->weighted_req_power;

list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
cdev = instance->cdev;
@@ -453,7 +452,7 @@ static int allocate_power(struct thermal_zone_device *tz, int control_temp)

power_range = pid_controller(tz, control_temp, max_allocatable_power);

- divvy_up_power(weighted_req_power, max_power, num_actors,
+ divvy_up_power(weighted_req_power, max_power, params->num_actors,
total_weighted_req_power, power_range, granted_power,
extra_actor_power);

@@ -474,12 +473,10 @@ static int allocate_power(struct thermal_zone_device *tz, int control_temp)

trace_thermal_power_allocator(tz, req_power, total_req_power,
granted_power, total_granted_power,
- num_actors, power_range,
+ params->num_actors, power_range,
max_allocatable_power, tz->temperature,
control_temp - tz->temperature);

- kfree(req_power);
-
return 0;
}

@@ -606,6 +603,74 @@ static int check_power_actors(struct thermal_zone_device *tz,
return ret;
}

+static void _power_buffers_init(struct power_allocator_params *params,
+ u32 *req_power, u32 *max_power,
+ u32 *granted_power, u32 *extra_actor_power,
+ u32 *weighted_req_power)
+
+{
+ /* Setup internal buffers for power calculations. */
+ params->req_power = req_power;
+ params->max_power = max_power;
+ params->granted_power = granted_power;
+ params->extra_actor_power = extra_actor_power;
+ params->weighted_req_power = weighted_req_power;
+}
+
+static int allocate_actors_buffer(struct power_allocator_params *params,
+ int num_actors)
+{
+ u32 *req_power;
+ int ret;
+
+ kfree(params->req_power);
+
+ /* There might be no cooling devices yet. */
+ if (!num_actors) {
+ ret = -EINVAL;
+ goto clean_buffers;
+ }
+
+ req_power = kcalloc(num_actors * 5, sizeof(u32), GFP_KERNEL);
+ if (!req_power) {
+ ret = -ENOMEM;
+ goto clean_buffers;
+ }
+
+ params->num_actors = num_actors;
+ params->buffer_size = num_actors * 5 * sizeof(u32);
+
+ _power_buffers_init(params, req_power, &req_power[params->num_actors],
+ &req_power[2 * params->num_actors],
+ &req_power[3 * params->num_actors],
+ &req_power[4 * params->num_actors]);
+
+ return 0;
+
+clean_buffers:
+ params->num_actors = -EINVAL;
+ params->buffer_size = 0;
+ _power_buffers_init(params, NULL, NULL, NULL, NULL, NULL);
+ return ret;
+}
+
+static void power_allocator_instances_update(struct thermal_zone_device *tz)
+{
+ struct power_allocator_params *params = tz->governor_data;
+ struct thermal_instance *instance;
+ int num_actors = 0;
+
+ list_for_each_entry(instance, &tz->thermal_instances, tz_node)
+ if ((instance->trip == params->trip_max) &&
+ cdev_is_power_actor(instance->cdev))
+ num_actors++;
+
+ if (num_actors == params->num_actors)
+ return;
+
+ allocate_actors_buffer(params, num_actors);
+}
+
/**
* power_allocator_bind() - bind the power_allocator governor to a thermal zone
* @tz: thermal zone to bind it to
@@ -639,6 +704,13 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
return ret;
}

+ ret = allocate_actors_buffer(params, ret);
+ if (ret) {
+ dev_warn(&tz->device, "power_allocator: allocation failed\n");
+ kfree(params);
+ return ret;
+ }
+
if (!tz->tzp) {
tz->tzp = kzalloc(sizeof(*tz->tzp), GFP_KERNEL);
if (!tz->tzp) {
@@ -663,6 +735,7 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
return 0;

free_params:
+ kfree(params->req_power);
kfree(params);

return ret;
@@ -679,6 +752,7 @@ static void power_allocator_unbind(struct thermal_zone_device *tz)
tz->tzp = NULL;
}

+ kfree(params->req_power);
kfree(tz->governor_data);
tz->governor_data = NULL;
}
@@ -717,5 +791,6 @@ static struct thermal_governor thermal_gov_power_allocator = {
.bind_to_tz = power_allocator_bind,
.unbind_from_tz = power_allocator_unbind,
.throttle = power_allocator_throttle,
+ .instances_update = power_allocator_instances_update,
};
THERMAL_GOVERNOR_DECLARE(thermal_gov_power_allocator);
--
2.25.1

2023-12-06 11:31:23

by Lukasz Luba

[permalink] [raw]
Subject: [PATCH 4/5] thermal: gov_power_allocator: Simplify checks for valid power actor

There is a need to check if the cooling device in the thermal zone
supports IPA callback and is set for control trip point.
Refactor the code which validates the power actor capabilities and
make it more consistent in all places.

No intentional functional impact.

Signed-off-by: Lukasz Luba <[email protected]>
---
drivers/thermal/gov_power_allocator.c | 41 +++++++++++----------------
1 file changed, 17 insertions(+), 24 deletions(-)

diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index 1b55d00fc12b..8f2d2ee3def0 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -85,6 +85,13 @@ struct power_allocator_params {
u32 *weighted_req_power;
};

+static bool power_actor_is_valid(struct power_allocator_params *params,
+ struct thermal_instance *instance)
+{
+ return ((instance->trip == params->trip_max) &&
+ cdev_is_power_actor(instance->cdev));
+}
+
/**
* estimate_sustainable_power() - Estimate the sustainable power of a thermal zone
* @tz: thermal zone we are operating in
@@ -105,14 +112,10 @@ static u32 estimate_sustainable_power(struct thermal_zone_device *tz)
u32 min_power;

list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
- cdev = instance->cdev;
-
- if (instance->trip != params->trip_max)
- continue;
-
- if (!cdev_is_power_actor(cdev))
+ if (!power_actor_is_valid(params, instance))
continue;

+ cdev = instance->cdev;
if (cdev->ops->state2power(cdev, instance->upper, &min_power))
continue;

@@ -407,8 +410,7 @@ static int allocate_power(struct thermal_zone_device *tz, int control_temp)
return -ENODEV;

list_for_each_entry(instance, &tz->thermal_instances, tz_node)
- if ((instance->trip == params->trip_max) &&
- cdev_is_power_actor(instance->cdev))
+ if (power_actor_is_valid(params, instance))
total_weight += instance->weight;

/* Clean all buffers for new power estimations */
@@ -421,14 +423,10 @@ static int allocate_power(struct thermal_zone_device *tz, int control_temp)
weighted_req_power = params->weighted_req_power;

list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
- cdev = instance->cdev;
-
- if (instance->trip != params->trip_max)
- continue;
-
- if (!cdev_is_power_actor(cdev))
+ if (!power_actor_is_valid(params, instance))
continue;

+ cdev = instance->cdev;
if (cdev->ops->get_requested_power(cdev, &req_power[i]))
continue;

@@ -458,10 +456,7 @@ static int allocate_power(struct thermal_zone_device *tz, int control_temp)

i = 0;
list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
- if (instance->trip != params->trip_max)
- continue;
-
- if (!cdev_is_power_actor(instance->cdev))
+ if (!power_actor_is_valid(params, instance))
continue;

power_actor_set_power(instance->cdev, instance,
@@ -546,12 +541,11 @@ static void allow_maximum_power(struct thermal_zone_device *tz, bool update)
u32 req_power;

list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
- cdev = instance->cdev;
-
- if (instance->trip != params->trip_max ||
- !cdev_is_power_actor(instance->cdev))
+ if (!power_actor_is_valid(params, instance))
continue;

+ cdev = instance->cdev;
+
instance->target = 0;
mutex_lock(&cdev->lock);
/*
@@ -661,8 +655,7 @@ static void power_allocator_instances_update(struct thermal_zone_device *tz)
int num_actors = 0;

list_for_each_entry(instance, &tz->thermal_instances, tz_node)
- if ((instance->trip == params->trip_max) &&
- cdev_is_power_actor(instance->cdev))
+ if (power_actor_is_valid(params, instance))
num_actors++;

if (num_actors == params->num_actors)
--
2.25.1

2023-12-06 11:31:23

by Lukasz Luba

[permalink] [raw]
Subject: [PATCH 5/5] thermal: gov_power_allocator: Refactor checks in divvy_up_power()

Simplify the code and remove one extra 'if' block.

No intentional functional impact.

Signed-off-by: Lukasz Luba <[email protected]>
---
drivers/thermal/gov_power_allocator.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index 8f2d2ee3def0..350a39c23ac4 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -349,7 +349,8 @@ static void divvy_up_power(u32 *req_power, u32 *max_power, int num_actors,
u32 total_req_power, u32 power_range,
u32 *granted_power, u32 *extra_actor_power)
{
- u32 extra_power, capped_extra_power;
+ u32 capped_extra_power = 0;
+ u32 extra_power = 0;
int i;

/*
@@ -358,8 +359,6 @@ static void divvy_up_power(u32 *req_power, u32 *max_power, int num_actors,
if (!total_req_power)
total_req_power = 1;

- capped_extra_power = 0;
- extra_power = 0;
for (i = 0; i < num_actors; i++) {
u64 req_range = (u64)req_power[i] * power_range;

@@ -375,7 +374,7 @@ static void divvy_up_power(u32 *req_power, u32 *max_power, int num_actors,
capped_extra_power += extra_actor_power[i];
}

- if (!extra_power)
+ if (!extra_power || !capped_extra_power)
return;

/*
@@ -383,12 +382,13 @@ static void divvy_up_power(u32 *req_power, u32 *max_power, int num_actors,
* how far they are from the max
*/
extra_power = min(extra_power, capped_extra_power);
- if (capped_extra_power > 0)
- for (i = 0; i < num_actors; i++) {
- u64 extra_range = (u64)extra_actor_power[i] * extra_power;
- granted_power[i] += DIV_ROUND_CLOSEST_ULL(extra_range,
- capped_extra_power);
- }
+
+ for (i = 0; i < num_actors; i++) {
+ u64 extra_range = (u64)extra_actor_power[i] * extra_power;
+
+ granted_power[i] += DIV_ROUND_CLOSEST_ULL(extra_range,
+ capped_extra_power);
+ }
}

static int allocate_power(struct thermal_zone_device *tz, int control_temp)
--
2.25.1

2023-12-11 20:42:17

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 1/5] thermal: core: Add callback for governors with cooling instances change

On Wed, Dec 6, 2023 at 12:30 PM Lukasz Luba <[email protected]> wrote:
>
> Allow governors to react to the changes in the cooling instances list.
> That makes possible to move memory allocations related to the number of
> cooling instances out of the throttle() callback. The throttle() callback
> is called much more often thus the cost of managing allocations there is
> an extra overhead. The list of cooling instances is not changed that often
> and it can be handled in dedicated callback. That will save CPU cycles
> in the throttle() code path. Both callbacks code paths are protected with
> the same thermal zone lock, which guaranties the list of cooling instances
> is consistent.
>
> Signed-off-by: Lukasz Luba <[email protected]>

I agree with the direction, but I'm wondering if changes of the
bindings between trip points and cooling devices are the only type of
changes which can affect the IPA. For instance, what if the trip
point temperatures are updated?

If it needs to react to other types of changes in general, it may be
good to introduce a more general callback that can be made handle them
in the future.

> ---
> drivers/thermal/thermal_core.c | 14 ++++++++++++++
> include/linux/thermal.h | 4 ++++
> 2 files changed, 18 insertions(+)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 625ba07cbe2f..c993b86f7fb5 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -314,6 +314,15 @@ static void handle_non_critical_trips(struct thermal_zone_device *tz,
> def_governor->throttle(tz, trip);
> }
>
> +static void handle_instances_list_update(struct thermal_zone_device *tz)
> +{
> +
> + if (!tz->governor || !tz->governor->instances_update)
> + return;
> +
> + tz->governor->instances_update(tz);
> +}
> +
> void thermal_zone_device_critical(struct thermal_zone_device *tz)
> {
> /*
> @@ -723,6 +732,8 @@ int thermal_bind_cdev_to_trip(struct thermal_zone_device *tz,
> list_add_tail(&dev->tz_node, &tz->thermal_instances);
> list_add_tail(&dev->cdev_node, &cdev->thermal_instances);
> atomic_set(&tz->need_update, 1);
> +
> + handle_instances_list_update(tz);
> }
> mutex_unlock(&cdev->lock);
> mutex_unlock(&tz->lock);
> @@ -781,6 +792,9 @@ int thermal_unbind_cdev_from_trip(struct thermal_zone_device *tz,
> if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) {
> list_del(&pos->tz_node);
> list_del(&pos->cdev_node);
> +
> + handle_instances_list_update(tz);
> +
> mutex_unlock(&cdev->lock);
> mutex_unlock(&tz->lock);
> goto unbind;
> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
> index c7190e2dfcb4..e7b2a1f4bab0 100644
> --- a/include/linux/thermal.h
> +++ b/include/linux/thermal.h
> @@ -195,6 +195,9 @@ struct thermal_zone_device {
> * thermal zone.
> * @throttle: callback called for every trip point even if temperature is
> * below the trip point temperature
> + * @instances_update: callback called when thermal zone instances list
> + * i has changed (e.g. added new or removed), which
> + * may help to offload work for governor like allocations
> * @governor_list: node in thermal_governor_list (in thermal_core.c)
> */
> struct thermal_governor {
> @@ -203,6 +206,7 @@ struct thermal_governor {
> void (*unbind_from_tz)(struct thermal_zone_device *tz);
> int (*throttle)(struct thermal_zone_device *tz,
> const struct thermal_trip *trip);
> + void (*instances_update)(struct thermal_zone_device *tz);

So this could be more general I think, something like (*update_tz)(),
and it may take an additional argument representing the type of the
change.

> struct list_head governor_list;
> };
>
> --
> 2.25.1
>

2023-12-12 08:37:35

by Lukasz Luba

[permalink] [raw]
Subject: Re: [PATCH 1/5] thermal: core: Add callback for governors with cooling instances change



On 12/11/23 20:41, Rafael J. Wysocki wrote:
> On Wed, Dec 6, 2023 at 12:30 PM Lukasz Luba <[email protected]> wrote:
>>
>> Allow governors to react to the changes in the cooling instances list.
>> That makes possible to move memory allocations related to the number of
>> cooling instances out of the throttle() callback. The throttle() callback
>> is called much more often thus the cost of managing allocations there is
>> an extra overhead. The list of cooling instances is not changed that often
>> and it can be handled in dedicated callback. That will save CPU cycles
>> in the throttle() code path. Both callbacks code paths are protected with
>> the same thermal zone lock, which guaranties the list of cooling instances
>> is consistent.
>>
>> Signed-off-by: Lukasz Luba <[email protected]>
>
> I agree with the direction, but I'm wondering if changes of the
> bindings between trip points and cooling devices are the only type of
> changes which can affect the IPA. For instance, what if the trip
> point temperatures are updated?

Yes, that example sounds also interesting for a callback. The trip
temperature update won't affect the allocation/free code, though.

>
> If it needs to react to other types of changes in general, it may be
> good to introduce a more general callback that can be made handle them
> in the future.

Fair enough.

>
>> ---
>> drivers/thermal/thermal_core.c | 14 ++++++++++++++
>> include/linux/thermal.h | 4 ++++
>> 2 files changed, 18 insertions(+)
>>
>> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
>> index 625ba07cbe2f..c993b86f7fb5 100644
>> --- a/drivers/thermal/thermal_core.c
>> +++ b/drivers/thermal/thermal_core.c
>> @@ -314,6 +314,15 @@ static void handle_non_critical_trips(struct thermal_zone_device *tz,
>> def_governor->throttle(tz, trip);
>> }
>>
>> +static void handle_instances_list_update(struct thermal_zone_device *tz)
>> +{
>> +
>> + if (!tz->governor || !tz->governor->instances_update)
>> + return;
>> +
>> + tz->governor->instances_update(tz);
>> +}
>> +
>> void thermal_zone_device_critical(struct thermal_zone_device *tz)
>> {
>> /*
>> @@ -723,6 +732,8 @@ int thermal_bind_cdev_to_trip(struct thermal_zone_device *tz,
>> list_add_tail(&dev->tz_node, &tz->thermal_instances);
>> list_add_tail(&dev->cdev_node, &cdev->thermal_instances);
>> atomic_set(&tz->need_update, 1);
>> +
>> + handle_instances_list_update(tz);
>> }
>> mutex_unlock(&cdev->lock);
>> mutex_unlock(&tz->lock);
>> @@ -781,6 +792,9 @@ int thermal_unbind_cdev_from_trip(struct thermal_zone_device *tz,
>> if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) {
>> list_del(&pos->tz_node);
>> list_del(&pos->cdev_node);
>> +
>> + handle_instances_list_update(tz);
>> +
>> mutex_unlock(&cdev->lock);
>> mutex_unlock(&tz->lock);
>> goto unbind;
>> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
>> index c7190e2dfcb4..e7b2a1f4bab0 100644
>> --- a/include/linux/thermal.h
>> +++ b/include/linux/thermal.h
>> @@ -195,6 +195,9 @@ struct thermal_zone_device {
>> * thermal zone.
>> * @throttle: callback called for every trip point even if temperature is
>> * below the trip point temperature
>> + * @instances_update: callback called when thermal zone instances list
>> + * i has changed (e.g. added new or removed), which
>> + * may help to offload work for governor like allocations
>> * @governor_list: node in thermal_governor_list (in thermal_core.c)
>> */
>> struct thermal_governor {
>> @@ -203,6 +206,7 @@ struct thermal_governor {
>> void (*unbind_from_tz)(struct thermal_zone_device *tz);
>> int (*throttle)(struct thermal_zone_device *tz,
>> const struct thermal_trip *trip);
>> + void (*instances_update)(struct thermal_zone_device *tz);
>
> So this could be more general I think, something like (*update_tz)(),
> and it may take an additional argument representing the type of the
> change.

I agree. I'll send next version.

There is one candidate which could instantly be added to this
update reasons list:
cooling devices weight change via sysfs [1]

Thanks for the review comments!

[1]
https://elixir.bootlin.com/linux/latest/source/drivers/thermal/thermal_sysfs.c#L959

2023-12-20 14:08:20

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 2/5] thermal: gov_power_allocator: Refactor check_power_actors()

On Wed, Dec 6, 2023 at 12:30 PM Lukasz Luba <[email protected]> wrote:
>
> Refactor check_power_actors() to make it possible for re-use in the
> upcoming new callback.

I would say "In preparation for a subsequent change, rearrange
check_power_actors()".
>
> No intentional functional impact.
>
> Signed-off-by: Lukasz Luba <[email protected]>
> ---
> drivers/thermal/gov_power_allocator.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
> index 785fff14223d..38e1e89ba10c 100644
> --- a/drivers/thermal/gov_power_allocator.c
> +++ b/drivers/thermal/gov_power_allocator.c
> @@ -581,8 +581,9 @@ static void allow_maximum_power(struct thermal_zone_device *tz, bool update)
> * power actor API. The warning should help to investigate the issue, which
> * could be e.g. lack of Energy Model for a given device.
> *
> - * Return: 0 on success, -EINVAL if any cooling device does not implement
> - * the power actor API.
> + * Return number of cooling devices or -EINVAL if any cooling device does not
> + * implement the power actor API. Return value 0 is also valid since cooling
> + * devices might be attached later.

I would say "If all of the cooling devices currently attached to @tz
implement the power actor API, return the number of them (which may be
0, because some cooling devices may be attached later). Otherwise,
return -EINVAL."

> */
> static int check_power_actors(struct thermal_zone_device *tz,
> struct power_allocator_params *params)
> @@ -597,8 +598,9 @@ static int check_power_actors(struct thermal_zone_device *tz,
> if (!cdev_is_power_actor(instance->cdev)) {
> dev_warn(&tz->device, "power_allocator: %s is not a power actor\n",
> instance->cdev->type);
> - ret = -EINVAL;
> + return -EINVAL;
> }
> + ret++;
> }
>
> return ret;
> @@ -631,7 +633,7 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
> }
>
> ret = check_power_actors(tz, params);
> - if (ret) {
> + if (ret < 0) {
> dev_warn(&tz->device, "power_allocator: binding failed\n");
> kfree(params);
> return ret;
> --

2023-12-20 16:28:30

by Lukasz Luba

[permalink] [raw]
Subject: Re: [PATCH 2/5] thermal: gov_power_allocator: Refactor check_power_actors()



On 12/20/23 14:07, Rafael J. Wysocki wrote:
> On Wed, Dec 6, 2023 at 12:30 PM Lukasz Luba <[email protected]> wrote:
>>
>> Refactor check_power_actors() to make it possible for re-use in the
>> upcoming new callback.
>
> I would say "In preparation for a subsequent change, rearrange
> check_power_actors()".

Agree, I'll use it.

>>
>> No intentional functional impact.
>>
>> Signed-off-by: Lukasz Luba <[email protected]>
>> ---
>> drivers/thermal/gov_power_allocator.c | 10 ++++++----
>> 1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
>> index 785fff14223d..38e1e89ba10c 100644
>> --- a/drivers/thermal/gov_power_allocator.c
>> +++ b/drivers/thermal/gov_power_allocator.c
>> @@ -581,8 +581,9 @@ static void allow_maximum_power(struct thermal_zone_device *tz, bool update)
>> * power actor API. The warning should help to investigate the issue, which
>> * could be e.g. lack of Energy Model for a given device.
>> *
>> - * Return: 0 on success, -EINVAL if any cooling device does not implement
>> - * the power actor API.
>> + * Return number of cooling devices or -EINVAL if any cooling device does not
>> + * implement the power actor API. Return value 0 is also valid since cooling
>> + * devices might be attached later.
>
> I would say "If all of the cooling devices currently attached to @tz
> implement the power actor API, return the number of them (which may be
> 0, because some cooling devices may be attached later). Otherwise,
> return -EINVAL."
>

Yes, I'll use that sentence as well.
Those will be in the next version (v3). Thanks!