2024-05-24 14:32:05

by Alexandre Bailon

[permalink] [raw]
Subject: [PATCH v3 0/6] thermal: Add support of multiple sensors

Following this comment [1], this updates thermal_of to support multiple
sensors.

This series intends to add support of thermal aggregation.
One use case for it is using the IPA in the case we have
multiple sensors for one performance domain.

This has been tested on the mt8195 using s-tui.
To test and validate, we heat up the CPU and the heat sink.
At some point, we run benchmark tests with different configurations:
- Mediatek kernel (IPA + their own thermal aggregation)
- Mainline kernel
- Mainline kernel with IPA and aggregation enabled
With the IPA and the aggregation enabled, we get the best performances
with the most stable CPU temperature.

The aggregation is configured and enabled using device tree.
One thermal zone has to be created with a list of sensors.
It will take care of registering a thermal zone for each sensors.
The cooling device will only be registered with the aggregating thermal zone.

There are still something important missing: a way to check that all
aggregated sensors are part of the same performance domain.
So far, I don't see how this should be done. Some recommendations would be
appreciated.

Changes in v2:
- Rebased on 6.7
- Seperated generic multi sensor and dt specfic code
- Simplified the code
- Drop min / max and only do weighted average (seems more adequate for IPA)

Changes in v3:
- Rebased on 6.9
- Reworked the way to register a multi sensor thermal zone
- Only one thermal zone to define in device tree
- Max has been re-added
- Enabled it on mt8195

[1]: https://patchwork.kernel.org/comment/24723927/

Alexandre Bailon (6):
dt-bindings: thermal: Restore the thermal-sensors property
thermal: Add support of multi sensors to thermal_core
thermal: Add support of multi sensors to thermal_of
dt-bindings: thermal: Add a property to select the aggregation type
thermal: of: Parse aggregation property to select the aggegration type
ARM64: mt8195: Use thermal aggregation for big and little cpu

.../bindings/thermal/thermal-zones.yaml | 13 +-
arch/arm64/boot/dts/mediatek/mt8195.dtsi | 212 ++---------
drivers/thermal/Makefile | 1 +
drivers/thermal/thermal_core.h | 15 +
drivers/thermal/thermal_multi.c | 332 ++++++++++++++++++
drivers/thermal/thermal_of.c | 282 ++++++++++++++-
include/uapi/linux/thermal.h | 5 +
7 files changed, 663 insertions(+), 197 deletions(-)
create mode 100644 drivers/thermal/thermal_multi.c

--
2.44.1



2024-05-24 14:32:17

by Alexandre Bailon

[permalink] [raw]
Subject: [PATCH v3 1/6] dt-bindings: thermal: Restore the thermal-sensors property

thermal-sensors was defined in thermal.txt but when the yaml binding
has been defined, its definition has changed, dropping support of multi
sensors.
Since we are adding support of multi sensors, use the original definition
for thermal-sensors property.

Signed-off-by: Alexandre Bailon <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
---
Documentation/devicetree/bindings/thermal/thermal-zones.yaml | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
index 68398e7e8655..fa7a72e2ba44 100644
--- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
+++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
@@ -93,10 +93,9 @@ patternProperties:

thermal-sensors:
$ref: /schemas/types.yaml#/definitions/phandle-array
- maxItems: 1
description:
- The thermal sensor phandle and sensor specifier used to monitor this
- thermal zone.
+ A list of thermal sensor phandles and sensor specifier
+ used while monitoring the thermal zone.

coefficients:
$ref: /schemas/types.yaml#/definitions/uint32-array
--
2.44.1


2024-05-24 14:32:36

by Alexandre Bailon

[permalink] [raw]
Subject: [PATCH v3 2/6] thermal: Add support of multi sensors to thermal_core

This adds support of multi sensors to thermal.
Currently, this only support the get_temp operation.
This returns an average temperature of all the sensors.
If defined, a coefficient is applied to the value read from the sensor
before computing the average.

Signed-off-by: Alexandre Bailon <[email protected]>
---
drivers/thermal/Makefile | 1 +
drivers/thermal/thermal_core.h | 15 ++
drivers/thermal/thermal_multi.c | 332 ++++++++++++++++++++++++++++++++
include/uapi/linux/thermal.h | 5 +
4 files changed, 353 insertions(+)
create mode 100644 drivers/thermal/thermal_multi.c

diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 5cdf7d68687f..872190f9062b 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -6,6 +6,7 @@ CFLAGS_thermal_core.o := -I$(src)
obj-$(CONFIG_THERMAL) += thermal_sys.o
thermal_sys-y += thermal_core.o thermal_sysfs.o
thermal_sys-y += thermal_trip.o thermal_helpers.o
+thermal_sys-y += thermal_multi.o

# netlink interface to manage the thermal framework
thermal_sys-$(CONFIG_THERMAL_NETLINK) += thermal_netlink.o
diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
index 0d8a42bb7ce8..224735b644bc 100644
--- a/drivers/thermal/thermal_core.h
+++ b/drivers/thermal/thermal_core.h
@@ -142,6 +142,21 @@ ssize_t weight_show(struct device *, struct device_attribute *, char *);
ssize_t weight_store(struct device *, struct device_attribute *, const char *,
size_t);

+/* Multi sensors */
+struct thermal_zone_device *thermal_multi_sensor_find_tz(const char *type);
+struct thermal_zone_device_ops *thermal_multi_sensor_alloc_ops(int aggr_type);
+struct thermal_zone_device *thermal_multi_sensor_tz_alloc(const char *type,
+ struct thermal_trip *trips,
+ int num_trips,
+ struct thermal_zone_device_ops *ops,
+ int passive_delay, int polling_delay);
+void thermal_multi_sensor_tz_free(struct thermal_zone_device *tz);
+int thermal_multi_sensor_validate_coeff(int *coeff, int count, int offset);
+int thermal_multi_sensor_register(struct thermal_zone_device *tz,
+ struct thermal_zone_device *sensor_tz, int coeff);
+void thermal_multi_sensor_unregister(struct thermal_zone_device *sensor_tz);
+
+
#ifdef CONFIG_THERMAL_STATISTICS
void thermal_cooling_device_stats_update(struct thermal_cooling_device *cdev,
unsigned long new_state);
diff --git a/drivers/thermal/thermal_multi.c b/drivers/thermal/thermal_multi.c
new file mode 100644
index 000000000000..cee0ded8dc25
--- /dev/null
+++ b/drivers/thermal/thermal_multi.c
@@ -0,0 +1,332 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/err.h>
+#include <linux/export.h>
+#include <linux/of.h>
+#include <linux/slab.h>
+#include <linux/thermal.h>
+#include <linux/types.h>
+#include <linux/string.h>
+
+#include "thermal_core.h"
+
+struct sensor_interface {
+ struct thermal_zone_device *tz;
+ int coeff;
+
+ struct list_head node;
+};
+
+struct multi_sensor_thermal_zone {
+ struct thermal_zone_device *tz;
+ struct mutex sensors_lock;
+ struct list_head sensors;
+
+ struct list_head node;
+};
+
+static DEFINE_MUTEX(multi_tz_mutex);
+static LIST_HEAD(multi_tz_list);
+
+#define TJ_MAX 120000
+
+static int multi_sensor_get_temp(struct thermal_zone_device *tz, int *temp)
+{
+ struct multi_sensor_thermal_zone *multi_tz = tz->devdata;
+ struct sensor_interface *sensor;
+ int accumulated_temp = 0;
+ u32 accumulated_coeff = 0;
+ int ret;
+
+ mutex_lock(&multi_tz->sensors_lock);
+
+ if (list_empty(&multi_tz->sensors)) {
+ mutex_unlock(&multi_tz->sensors_lock);
+ return -ENODEV;
+ }
+
+ list_for_each_entry(sensor, &multi_tz->sensors, node) {
+ ret = thermal_zone_get_temp(sensor->tz, temp);
+ if (ret) {
+ mutex_unlock(&multi_tz->sensors_lock);
+ return ret;
+ }
+
+ accumulated_temp += *temp * sensor->coeff;
+ accumulated_coeff += sensor->coeff;
+ }
+
+ mutex_unlock(&multi_tz->sensors_lock);
+
+ *temp = accumulated_temp / accumulated_coeff;
+ return ret;
+}
+
+static int multi_sensor_get_temp_max(struct thermal_zone_device *tz, int *temp)
+{
+ struct multi_sensor_thermal_zone *multi_tz = tz->devdata;
+ struct sensor_interface *sensor;
+ int max_temp;
+ int ret;
+
+ mutex_lock(&multi_tz->sensors_lock);
+
+ if (list_empty(&multi_tz->sensors)) {
+ mutex_unlock(&multi_tz->sensors_lock);
+ return -ENODEV;
+ }
+
+ list_for_each_entry(sensor, &multi_tz->sensors, node) {
+ ret = thermal_zone_get_temp(sensor->tz, temp);
+ if (ret) {
+ mutex_unlock(&multi_tz->sensors_lock);
+ return ret;
+ }
+
+ max_temp = max(max_temp, *temp * sensor->coeff);
+ }
+
+ mutex_unlock(&multi_tz->sensors_lock);
+
+ *temp = max_temp;
+ return ret;
+}
+
+/**
+ * Check if the sum of the coefficients multiplied by sensors temperature plus
+ * an offset won't overflow during the aggregation.
+ * @coeff: An array of coefficient
+ * @count: Number of coefficient
+ * @offset: The offset
+ *
+ * Returns: 0 if the coefficient are safe, -EOVERFLOW otherwise
+ */
+int thermal_multi_sensor_validate_coeff(int *coeff, int count, int offset)
+{
+ int max_accumulated_temp = 0;
+ int i;
+
+ for (i = 0; i < count; i++) {
+ max_accumulated_temp += TJ_MAX * coeff[i];
+ if (max_accumulated_temp < 0)
+ return -EOVERFLOW;
+ }
+
+ max_accumulated_temp += offset;
+ return max_accumulated_temp < 0 ? -EOVERFLOW : 0;
+}
+
+/**
+ * Find a multi sensor thermal zone
+ * @type: The thermal zone type to find
+ *
+ * Returns: a pointer to the thermal zone or NULL if not found
+ */
+struct thermal_zone_device *thermal_multi_sensor_find_tz(const char *type)
+{
+ struct thermal_zone_device *tz;
+
+ tz = thermal_zone_get_zone_by_name(type);
+ if (IS_ERR(tz))
+ return NULL;
+ return tz;
+}
+
+/**
+ * Allocate a struct thermal_zone_device_ops for the multi sensor thermal zoen
+ *
+ * This allocates a struct thermal_zone_device_ops with a predifiend get_temp
+ * operation. This allows setting the other function pointers before registering
+ * the thermal zone.
+ *
+ * @aggr_type: The aggregation type to use (THERMAL_AGGR_AVG or THERMAL_AGGR_MAX)
+ *
+ * Returns: a pointer to the created struct thermal_zone_device_ops or an
+ * in case of error, an ERR_PTR. Caller must check return value with
+ * IS_ERR*() helpers.
+ */
+struct thermal_zone_device_ops *thermal_multi_sensor_alloc_ops(int aggr_type)
+{
+ struct thermal_zone_device_ops *ops;
+
+ ops = kzalloc(sizeof(*ops), GFP_KERNEL);
+ if (!ops)
+ return ERR_PTR(-ENOMEM);
+
+ switch (aggr_type) {
+ case THERMAL_AGGR_AVG:
+ ops->get_temp = multi_sensor_get_temp;
+ break;
+ case THERMAL_AGGR_MAX:
+ ops->get_temp = multi_sensor_get_temp_max;
+ break;
+ default:
+ kfree(ops);
+ return ERR_PTR(-EINVAL);
+ }
+
+ return ops;
+}
+
+/**
+ * Register a new thermal zone device that supports multi sensors
+ * @type: the thermal zone device type
+ * @trips: a pointer to an array of thermal trips
+ * @num_trips: the number of trip points the thermal zone support
+ * @mask: a bit string indicating the writeablility of trip points
+ * @ops: standard thermal zone device callbacks
+ * @passive_delay: number of milliseconds to wait between polls when
+ * performing passive cooling
+ * @polling_delay: number of milliseconds to wait between polls when checking
+ * whether trip points have been crossed (0 for interrupt
+ * driven systems)
+ *
+ * This function allocates and register a multi sensor thermal zone.
+ * To register a sensor to this thermal zone, use thermal_multi_sensor_register().
+ * thermal_multi_sensor_unregister() must be called to unregister the sensors
+ * and release this thermal zone when it is not used anymore.
+ *
+ * Return: a pointer to the created struct thermal_zone_device or an
+ * in case of error, an ERR_PTR. Caller must check return value with
+ * IS_ERR*() helpers.
+ */
+struct thermal_zone_device *thermal_multi_sensor_tz_alloc(const char *type,
+ struct thermal_trip *trips,
+ int num_trips,
+ struct thermal_zone_device_ops *ops,
+ int passive_delay, int polling_delay)
+{
+ struct thermal_zone_device *tz;
+ struct thermal_zone_params tzp = {};
+ struct multi_sensor_thermal_zone *multi_tz;
+
+ mutex_lock(&multi_tz_mutex);
+
+ tz = thermal_zone_get_zone_by_name(type);
+ if (!IS_ERR(tz))
+ goto unlock;
+
+ multi_tz = kzalloc(sizeof(*multi_tz), GFP_KERNEL);
+ if (!multi_tz) {
+ tz = ERR_PTR(-ENOMEM);
+ goto unlock;
+ }
+ mutex_init(&multi_tz->sensors_lock);
+ INIT_LIST_HEAD(&multi_tz->sensors);
+
+ tzp.no_hwmon = true;
+ tzp.slope = 1;
+ tzp.offset = 0;
+
+ tz = thermal_zone_device_register_with_trips(type, trips, num_trips,
+ multi_tz, ops, &tzp,
+ passive_delay, polling_delay);
+ if (IS_ERR(tz)) {
+ kfree(multi_tz);
+ } else {
+ multi_tz->tz = tz;
+ list_add(&multi_tz->node, &multi_tz_list);
+ }
+
+unlock:
+ mutex_unlock(&multi_tz_mutex);
+ return tz;
+}
+
+/**
+ * Remove all sensors from multi sensor thermal zone and release it
+ *
+ * This function must not be used except on error path to correctly
+ * release all the allocated resources.
+ * Use thermal_multi_sensor_unregister() to unregister a sensor and
+ * release a thermal zone that is not used anymore.
+ *
+ * @tz: Pointer to thermal zone to release
+ */
+void thermal_multi_sensor_tz_free(struct thermal_zone_device *tz)
+{
+ struct multi_sensor_thermal_zone *multi_tz = tz->devdata;
+ struct thermal_trip *trips = tz->trips;
+ struct thermal_zone_device_ops *ops = &tz->ops;
+ struct sensor_interface *sensor, *tmp;
+
+ list_for_each_entry_safe(sensor, tmp, &multi_tz->sensors, node) {
+ list_del(&sensor->node);
+ kfree(sensor);
+ }
+
+ thermal_zone_device_unregister(tz);
+ list_del(&multi_tz->node);
+ kfree(multi_tz);
+ kfree(trips);
+ kfree(ops);
+}
+
+/**
+ * Register a thermal sensor to a multi sensor thermal zone
+ * @tz: The multi sensor thermal zone
+ * @sensor_tz: The thermal zone of the zensor to register
+ * @coeff: The coefficient to apply to the temperature returned by the sensor
+ *
+ * Returns: On success 0, a negative value in case of error
+ */
+int thermal_multi_sensor_register(struct thermal_zone_device *tz,
+ struct thermal_zone_device *sensor_tz,
+ int coeff)
+{
+ struct multi_sensor_thermal_zone *multi_tz;
+ struct sensor_interface *sensor;
+
+ mutex_lock(&multi_tz_mutex);
+
+ multi_tz = tz->devdata;
+
+ sensor = kzalloc(sizeof(*sensor), GFP_KERNEL);
+ if (!sensor) {
+ mutex_unlock(&multi_tz_mutex);
+ return -ENOMEM;
+ }
+
+ sensor->tz = sensor_tz;
+ sensor->coeff = coeff;
+ mutex_lock(&multi_tz->sensors_lock);
+ list_add(&sensor->node, &multi_tz->sensors);
+ mutex_unlock(&multi_tz->sensors_lock);
+
+ thermal_zone_device_enable(tz);
+
+ mutex_unlock(&multi_tz_mutex);
+
+ return 0;
+}
+
+/**
+ * Unregister a thermal sensor from a multi sensor thermal zone
+ *
+ * This unregister a thermal sensor from a multi sensor thermal zone.
+ * If all the sensors have been removed then this also release the multi sensor
+ * thermal zone.
+ * @sensor_tz: The sensor to unregister
+ */
+void thermal_multi_sensor_unregister(struct thermal_zone_device *sensor_tz)
+{
+ struct multi_sensor_thermal_zone *multi_tz, *tmp_tz;
+ struct sensor_interface *sensor, *tmp;
+
+ mutex_lock(&multi_tz_mutex);
+ list_for_each_entry_safe(multi_tz, tmp_tz, &multi_tz_list, node) {
+ mutex_lock(&multi_tz->sensors_lock);
+ list_for_each_entry_safe(sensor, tmp, &multi_tz->sensors, node) {
+ if (sensor->tz == sensor_tz) {
+ list_del(&sensor->node);
+ kfree(sensor);
+ break;
+ }
+ }
+ mutex_unlock(&multi_tz->sensors_lock);
+
+ if (list_empty(&multi_tz->sensors))
+ thermal_multi_sensor_tz_free(multi_tz->tz);
+ }
+ mutex_unlock(&multi_tz_mutex);
+}
diff --git a/include/uapi/linux/thermal.h b/include/uapi/linux/thermal.h
index fc78bf3aead7..e4f6c4c5e6fd 100644
--- a/include/uapi/linux/thermal.h
+++ b/include/uapi/linux/thermal.h
@@ -16,6 +16,11 @@ enum thermal_trip_type {
THERMAL_TRIP_CRITICAL,
};

+enum thermal_aggregation_type {
+ THERMAL_AGGR_AVG = 0,
+ THERMAL_AGGR_MAX = 1,
+};
+
/* Adding event notification support elements */
#define THERMAL_GENL_FAMILY_NAME "thermal"
#define THERMAL_GENL_VERSION 0x01
--
2.44.1


2024-05-24 14:32:41

by Alexandre Bailon

[permalink] [raw]
Subject: [PATCH v3 3/6] thermal: Add support of multi sensors to thermal_of

This updates thermal_of to support more than one sensor.
If during the registration we find another thermal zone referencing
this sensors and some other, then we create the multi sensor thermal
zone (if it doesn't exist) and register the sensor to it.

Signed-off-by: Alexandre Bailon <[email protected]>
---
drivers/thermal/thermal_of.c | 250 +++++++++++++++++++++++++++++++++--
1 file changed, 241 insertions(+), 9 deletions(-)

diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index aa34b6e82e26..75e3cfb8488a 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -18,6 +18,8 @@

#include "thermal_core.h"

+#define STRLEN_ID (8)
+
/*** functions parsing device tree nodes ***/

static int of_find_trip_id(struct device_node *np, struct device_node *trip)
@@ -222,6 +224,77 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
return tz;
}

+static int thermal_of_multi_sensor_get_name(struct device_node *sensor, int id,
+ struct device_node *tz, char *name)
+{
+ struct of_phandle_args sensor_specs;
+ int count, i;
+
+ tz = of_thermal_zone_find(sensor, id);
+ if (!tz) {
+ pr_debug("No thermal zones description\n");
+ return -ENODEV;
+ }
+
+ count = of_count_phandle_with_args(tz, "thermal-sensors",
+ "#thermal-sensor-cells");
+ if (count <= 0)
+ return count;
+
+ for (i = 0; i < count; i++) {
+
+ int ret;
+
+ ret = of_parse_phandle_with_args(tz, "thermal-sensors",
+ "#thermal-sensor-cells",
+ i, &sensor_specs);
+ if (ret < 0) {
+ pr_err("%pOFn: Failed to read thermal-sensors cells: %d\n", tz, ret);
+ return ret;
+ }
+
+ if ((sensor == sensor_specs.np) && id == (sensor_specs.args_count ?
+ sensor_specs.args[0] : 0)) {
+ snprintf(name, THERMAL_NAME_LENGTH, "%s%d", tz->name, id);
+ return 0;
+ }
+ }
+
+ return -ENODEV;
+}
+
+static int thermal_of_multi_sensor_get_id(struct device_node *sensor,
+ struct device_node *tz, int id)
+{
+ struct of_phandle_args sensor_specs;
+ int count, i;
+
+ count = of_count_phandle_with_args(tz, "thermal-sensors",
+ "#thermal-sensor-cells");
+ if (count <= 0)
+ return 0;
+
+ for (i = 0; i < count; i++) {
+
+ int ret;
+
+ ret = of_parse_phandle_with_args(tz, "thermal-sensors",
+ "#thermal-sensor-cells",
+ i, &sensor_specs);
+ if (ret < 0) {
+ pr_err("%pOFn: Failed to read thermal-sensors cells: %d\n", tz, ret);
+ return 0;
+ }
+
+ if ((sensor == sensor_specs.np) && id == (sensor_specs.args_count ?
+ sensor_specs.args[0] : 0)) {
+ return i;
+ }
+ }
+
+ return -ENODEV;
+}
+
static int thermal_of_monitor_init(struct device_node *np, int *delay, int *pdelay)
{
int ret;
@@ -281,6 +354,17 @@ static struct device_node *thermal_of_zone_get_by_name(struct thermal_zone_devic
return ERR_PTR(-ENODEV);

tz_np = of_get_child_by_name(np, tz->type);
+ if (!tz_np) {
+ char tmp[THERMAL_NAME_LENGTH];
+ char *ptr;
+
+ ptr = strrchr(tz->type, '.');
+ if (!ptr)
+ return ERR_PTR(-ENODEV);
+
+ strscpy(tmp, tz->type, (ptr - tz->type) + 1);
+ tz_np = of_get_child_by_name(np, tmp);
+ }

of_node_put(np);

@@ -444,10 +528,140 @@ static int thermal_of_unbind(struct thermal_zone_device *tz,
*/
static void thermal_of_zone_unregister(struct thermal_zone_device *tz)
{
+ thermal_multi_sensor_unregister(tz);
thermal_zone_device_disable(tz);
thermal_zone_device_unregister(tz);
}

+static int thermal_of_multi_sensor_validate_coeff(struct device_node *sensor, int id,
+ struct device_node *tz_np)
+{
+ u32 *coeff;
+ int ret;
+ int i;
+
+ int count;
+ int index;
+ int offset;
+
+ index = thermal_of_multi_sensor_get_id(sensor, tz_np, id);
+ if (index < 0)
+ return -ENODEV;
+
+
+ count = of_count_phandle_with_args(tz_np,
+ "thermal-sensors",
+ "#thermal-sensor-cells");
+ if (count < 0)
+ return count;
+
+ coeff = kmalloc_array(count, sizeof(*coeff), GFP_KERNEL);
+ if (!coeff)
+ return -ENOMEM;
+
+ for (i = 0; i < count; i++) {
+ ret = of_property_read_u32_index(tz_np,
+ "coefficients",
+ i, coeff + i);
+ if (ret)
+ coeff[i] = 1;
+ }
+
+ ret = of_property_read_u32_index(tz_np, "coefficients",
+ count, &offset);
+ if (ret)
+ offset = 0;
+
+ /* Make sure the coeff and offset won't cause an overflow */
+ ret = thermal_multi_sensor_validate_coeff(coeff, count, offset);
+
+ kfree(coeff);
+
+ return ret;
+}
+
+static int thermal_of_mutli_sensor_coeff(struct device_node *sensor, int id,
+ struct device_node *tz_np,
+ u32 *coeff)
+{
+ int index;
+ int ret;
+
+ index = thermal_of_multi_sensor_get_id(sensor, tz_np, id);
+ if (index < 0)
+ return index;
+
+ ret = of_property_read_u32_index(tz_np, "coefficients", index, coeff);
+ if (ret)
+ *coeff = 1;
+
+ return 0;
+}
+
+static struct thermal_zone_device *
+thermal_of_register_multi_tz(struct device_node *sensor, int id, struct device_node *np,
+ const char *type, struct thermal_trip *trips, int num_trips,
+ void *devdata, struct thermal_zone_device_ops *ops,
+ const struct thermal_zone_params *tzp, int passive_delay,
+ int polling_delay)
+{
+ struct thermal_zone_device *multi_tz, *tz;
+ char name[THERMAL_NAME_LENGTH];
+ u32 coeff;
+ int ret;
+
+ multi_tz = thermal_multi_sensor_find_tz(type);
+ if (!multi_tz) {
+ struct thermal_zone_device_ops *multi_ops;
+
+ ret = thermal_of_multi_sensor_validate_coeff(sensor, id, np);
+ if (ret)
+ return ERR_PTR(ret);
+
+ multi_ops = thermal_multi_sensor_alloc_ops(THERMAL_AGGR_AVG);
+ if (IS_ERR_OR_NULL(multi_ops))
+ return ERR_PTR(PTR_ERR(multi_ops));
+ multi_ops->bind = thermal_of_bind;
+ multi_ops->unbind = thermal_of_unbind;
+
+ multi_tz = thermal_multi_sensor_tz_alloc(type, trips, num_trips,
+ multi_ops,
+ passive_delay, polling_delay);
+ if (IS_ERR_OR_NULL(multi_tz)) {
+ kfree(multi_ops);
+ return multi_tz;
+ }
+ }
+
+ ret = thermal_of_multi_sensor_get_name(sensor, id, np, name);
+ if (ret)
+ goto out_release_multi_tz;
+
+ tz = thermal_tripless_zone_device_register(name, devdata, ops, tzp);
+ if (IS_ERR_OR_NULL(tz)) {
+ ret = PTR_ERR(tz);
+ goto out_release_multi_tz;
+ }
+
+ ret = thermal_of_mutli_sensor_coeff(sensor, id, np, &coeff);
+ if (ret)
+ goto out_release_tz;
+
+ ret = thermal_multi_sensor_register(multi_tz, tz, coeff);
+ if (ret)
+ goto out_release_tz;
+
+ return tz;
+
+out_release_tz:
+ thermal_zone_device_unregister(tz);
+out_release_multi_tz:
+ thermal_multi_sensor_tz_free(multi_tz);
+
+ return ERR_PTR(ret);
+}
+
+
/**
* thermal_of_zone_register - Register a thermal zone with device node
* sensor
@@ -479,6 +693,7 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
const char *action;
int delay, pdelay;
int ntrips;
+ int count;
int ret;

np = of_thermal_zone_find(sensor, id);
@@ -488,10 +703,19 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
return ERR_CAST(np);
}

- trips = thermal_of_trips_init(np, &ntrips);
- if (IS_ERR(trips)) {
- pr_err("Failed to find trip points for %pOFn id=%d\n", sensor, id);
- return ERR_CAST(trips);
+ count = of_count_phandle_with_args(np, "thermal-sensors",
+ "#thermal-sensor-cells");
+ if (count <= 0)
+ return ERR_PTR(count);
+
+ /* Only allocate trips if the thermal zone doesn't exist yet */
+ if (!thermal_multi_sensor_find_tz(np->name)) {
+ trips = thermal_of_trips_init(np, &ntrips);
+ if (IS_ERR(trips)) {
+ pr_err("Failed to find trip points for %pOFn id=%d\n", sensor, id);
+ ret = PTR_ERR(trips);
+ goto out_kfree_trips;
+ }
}

ret = thermal_of_monitor_init(np, &delay, &pdelay);
@@ -502,17 +726,25 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *

thermal_of_parameters_init(np, &tzp);

- of_ops.bind = thermal_of_bind;
- of_ops.unbind = thermal_of_unbind;
+ if (count == 1) {
+ of_ops.bind = thermal_of_bind;
+ of_ops.unbind = thermal_of_unbind;
+ }

ret = of_property_read_string(np, "critical-action", &action);
if (!ret)
if (!of_ops.critical && !strcasecmp(action, "reboot"))
of_ops.critical = thermal_zone_device_critical_reboot;

- tz = thermal_zone_device_register_with_trips(np->name, trips, ntrips,
- data, &of_ops, &tzp,
- pdelay, delay);
+ if (count == 1) {
+ tz = thermal_zone_device_register_with_trips(np->name, trips, ntrips,
+ data, &of_ops, &tzp,
+ pdelay, delay);
+ } else {
+ tz = thermal_of_register_multi_tz(sensor, id, np, np->name, trips,
+ ntrips, data, &of_ops, &tzp,
+ pdelay, delay);
+ }
if (IS_ERR(tz)) {
ret = PTR_ERR(tz);
pr_err("Failed to register thermal zone %pOFn: %d\n", np, ret);
--
2.44.1


2024-05-24 14:33:18

by Alexandre Bailon

[permalink] [raw]
Subject: [PATCH v3 5/6] thermal: of: Parse aggregation property to select the aggegration type

This updates the driver to parse the aggegration property in DT.
This allows selecting the aggregation function to apply for multi
sensors thermal zone.

Signed-off-by: Alexandre Bailon <[email protected]>
---
drivers/thermal/thermal_of.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)

diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 75e3cfb8488a..21c81dc91a41 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -533,6 +533,33 @@ static void thermal_of_zone_unregister(struct thermal_zone_device *tz)
thermal_zone_device_unregister(tz);
}

+static const char * const aggr_types[] = {
+ [THERMAL_AGGR_AVG] = "avg",
+ [THERMAL_AGGR_MAX] = "max",
+};
+
+static int thermal_of_multi_sensor_get_type(struct device_node *np,
+ enum thermal_aggregation_type *type)
+{
+ const char *t;
+ int err, i;
+
+ err = of_property_read_string(np, "aggregation", &t);
+ if (err < 0) {
+ *type = THERMAL_AGGR_AVG;
+ return 0;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(aggr_types); i++) {
+ if (!strcasecmp(t, aggr_types[i])) {
+ *type = i;
+ return 0;
+ }
+ }
+
+ return -EINVAL;
+}
+
static int thermal_of_multi_sensor_validate_coeff(struct device_node *sensor, int id,
struct device_node *tz_np)
{
@@ -606,6 +633,7 @@ thermal_of_register_multi_tz(struct device_node *sensor, int id, struct device_n
int polling_delay)
{
struct thermal_zone_device *multi_tz, *tz;
+ enum thermal_aggregation_type aggr_type;
char name[THERMAL_NAME_LENGTH];
u32 coeff;
int ret;
@@ -614,6 +642,10 @@ thermal_of_register_multi_tz(struct device_node *sensor, int id, struct device_n
if (!multi_tz) {
struct thermal_zone_device_ops *multi_ops;

+ ret = thermal_of_multi_sensor_get_type(np, &aggr_type);
+ if (ret)
+ return ERR_PTR(ret);
+
ret = thermal_of_multi_sensor_validate_coeff(sensor, id, np);
if (ret)
return ERR_PTR(ret);
--
2.44.1


2024-05-24 14:33:38

by Alexandre Bailon

[permalink] [raw]
Subject: [PATCH v3 6/6] ARM64: mt8195: Use thermal aggregation for big and little cpu

This uses the thermal aggregation for the mt8195 to get the maximal
temperature of big and little cpu clusters.

Signed-off-by: Alexandre Bailon <[email protected]>
---
arch/arm64/boot/dts/mediatek/mt8195.dtsi | 212 +++--------------------
1 file changed, 27 insertions(+), 185 deletions(-)

diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
index 5d8b68f86ce4..8aa2bf142622 100644
--- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
@@ -3600,50 +3600,31 @@ dp_tx: dp-tx@1c600000 {
};

thermal_zones: thermal-zones {
- cpu0-thermal {
+ cpu-little {
polling-delay = <1000>;
- polling-delay-passive = <250>;
- thermal-sensors = <&lvts_mcu MT8195_MCU_LITTLE_CPU0>;
+ polling-delay-passive = <100>;
+ thermal-sensors = <&lvts_mcu MT8195_MCU_LITTLE_CPU0>,
+ <&lvts_mcu MT8195_MCU_LITTLE_CPU1>,
+ <&lvts_mcu MT8195_MCU_LITTLE_CPU2>,
+ <&lvts_mcu MT8195_MCU_LITTLE_CPU3>;
+ sustainable-power = <1500>;
+ aggregation = "max";

trips {
- cpu0_alert: trip-alert {
- temperature = <85000>;
+ cpu_little_threshold: trip-point {
+ temperature = <68000>;
hysteresis = <2000>;
type = "passive";
};

- cpu0_crit: trip-crit {
- temperature = <100000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
-
- cooling-maps {
- map0 {
- trip = <&cpu0_alert>;
- cooling-device = <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
- };
- };
- };
-
- cpu1-thermal {
- polling-delay = <1000>;
- polling-delay-passive = <250>;
- thermal-sensors = <&lvts_mcu MT8195_MCU_LITTLE_CPU1>;
-
- trips {
- cpu1_alert: trip-alert {
+ cpu_little_target: target {
temperature = <85000>;
hysteresis = <2000>;
type = "passive";
};

- cpu1_crit: trip-crit {
- temperature = <100000>;
+ cpu_little_soc_max_crit: soc-max-crit {
+ temperature = <115000>;
hysteresis = <2000>;
type = "critical";
};
@@ -3651,7 +3632,7 @@ cpu1_crit: trip-crit {

cooling-maps {
map0 {
- trip = <&cpu1_alert>;
+ trip = <&cpu_little_target>;
cooling-device = <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
<&cpu1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
<&cpu2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
@@ -3660,170 +3641,31 @@ map0 {
};
};

- cpu2-thermal {
+ cpu-big {
polling-delay = <1000>;
polling-delay-passive = <250>;
- thermal-sensors = <&lvts_mcu MT8195_MCU_LITTLE_CPU2>;
+ thermal-sensors = <&lvts_mcu MT8195_MCU_BIG_CPU0>,
+ <&lvts_mcu MT8195_MCU_BIG_CPU1>,
+ <&lvts_mcu MT8195_MCU_BIG_CPU2>,
+ <&lvts_mcu MT8195_MCU_BIG_CPU3>;
+ sustainable-power = <1500>;
+ aggregation = "max";

trips {
- cpu2_alert: trip-alert {
- temperature = <85000>;
+ cpu_big_threshold: trip-point {
+ temperature = <68000>;
hysteresis = <2000>;
type = "passive";
};

- cpu2_crit: trip-crit {
- temperature = <100000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
-
- cooling-maps {
- map0 {
- trip = <&cpu2_alert>;
- cooling-device = <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
- };
- };
- };
-
- cpu3-thermal {
- polling-delay = <1000>;
- polling-delay-passive = <250>;
- thermal-sensors = <&lvts_mcu MT8195_MCU_LITTLE_CPU3>;
-
- trips {
- cpu3_alert: trip-alert {
- temperature = <85000>;
- hysteresis = <2000>;
- type = "passive";
- };
-
- cpu3_crit: trip-crit {
- temperature = <100000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
-
- cooling-maps {
- map0 {
- trip = <&cpu3_alert>;
- cooling-device = <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
- };
- };
- };
-
- cpu4-thermal {
- polling-delay = <1000>;
- polling-delay-passive = <250>;
- thermal-sensors = <&lvts_mcu MT8195_MCU_BIG_CPU0>;
-
- trips {
- cpu4_alert: trip-alert {
+ cpu_big_target: target {
temperature = <85000>;
hysteresis = <2000>;
type = "passive";
};

- cpu4_crit: trip-crit {
- temperature = <100000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
-
- cooling-maps {
- map0 {
- trip = <&cpu4_alert>;
- cooling-device = <&cpu4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
- };
- };
- };
-
- cpu5-thermal {
- polling-delay = <1000>;
- polling-delay-passive = <250>;
- thermal-sensors = <&lvts_mcu MT8195_MCU_BIG_CPU1>;
-
- trips {
- cpu5_alert: trip-alert {
- temperature = <85000>;
- hysteresis = <2000>;
- type = "passive";
- };
-
- cpu5_crit: trip-crit {
- temperature = <100000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
-
- cooling-maps {
- map0 {
- trip = <&cpu5_alert>;
- cooling-device = <&cpu4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
- };
- };
- };
-
- cpu6-thermal {
- polling-delay = <1000>;
- polling-delay-passive = <250>;
- thermal-sensors = <&lvts_mcu MT8195_MCU_BIG_CPU2>;
-
- trips {
- cpu6_alert: trip-alert {
- temperature = <85000>;
- hysteresis = <2000>;
- type = "passive";
- };
-
- cpu6_crit: trip-crit {
- temperature = <100000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
-
- cooling-maps {
- map0 {
- trip = <&cpu6_alert>;
- cooling-device = <&cpu4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
- <&cpu7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
- };
- };
- };
-
- cpu7-thermal {
- polling-delay = <1000>;
- polling-delay-passive = <250>;
- thermal-sensors = <&lvts_mcu MT8195_MCU_BIG_CPU3>;
-
- trips {
- cpu7_alert: trip-alert {
- temperature = <85000>;
- hysteresis = <2000>;
- type = "passive";
- };
-
- cpu7_crit: trip-crit {
- temperature = <100000>;
+ cpu_big_soc_max_crit: soc-max-crit {
+ temperature = <115000>;
hysteresis = <2000>;
type = "critical";
};
@@ -3831,7 +3673,7 @@ cpu7_crit: trip-crit {

cooling-maps {
map0 {
- trip = <&cpu7_alert>;
+ trip = <&cpu_big_target>;
cooling-device = <&cpu4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
<&cpu5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
<&cpu6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
--
2.44.1


2024-05-24 14:34:10

by Alexandre Bailon

[permalink] [raw]
Subject: [PATCH v3 4/6] dt-bindings: thermal: Add a property to select the aggregation type

This adds a new property named "aggregation" that could be used to
select the aggregation type when there are multiple sensors assigned
to a thermal zone.

Signed-off-by: Alexandre Bailon <[email protected]>
---
.../devicetree/bindings/thermal/thermal-zones.yaml | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
index fa7a72e2ba44..e6e4b46773e3 100644
--- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
+++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
@@ -111,6 +111,14 @@ patternProperties:
coefficients are ordered and are matched with sensors by means of the
sensor ID. Additional coefficients are interpreted as constant offset.

+ aggregation:
+ $ref: /schemas/types.yaml#/definitions/string
+ enum:
+ - avg
+ - max
+ description:
+ Aggregation type to use compute a temperature from multiple sensors.
+
sustainable-power:
$ref: /schemas/types.yaml#/definitions/uint32
description:
--
2.44.1


2024-05-27 06:56:34

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 4/6] dt-bindings: thermal: Add a property to select the aggregation type

On 24/05/2024 16:31, Alexandre Bailon wrote:
> This adds a new property named "aggregation" that could be used to
> select the aggregation type when there are multiple sensors assigned
> to a thermal zone.
>
> Signed-off-by: Alexandre Bailon <[email protected]>
> ---
> .../devicetree/bindings/thermal/thermal-zones.yaml | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> index fa7a72e2ba44..e6e4b46773e3 100644
> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> @@ -111,6 +111,14 @@ patternProperties:
> coefficients are ordered and are matched with sensors by means of the
> sensor ID. Additional coefficients are interpreted as constant offset.
>
> + aggregation:
> + $ref: /schemas/types.yaml#/definitions/string
> + enum:
> + - avg
> + - max
> + description:
> + Aggregation type to use compute a temperature from multiple sensors.

Hm, why exactly this is a property of hardware? Why on one board you
would like to average and on other use max? I can only think of a case
that some sensors give inaccurate data. Otherwise it is OS policy.

Best regards,
Krzysztof


2024-05-27 06:57:08

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 6/6] ARM64: mt8195: Use thermal aggregation for big and little cpu

On 24/05/2024 16:31, Alexandre Bailon wrote:
> This uses the thermal aggregation for the mt8195 to get the maximal
> temperature of big and little cpu clusters.
>
> Signed-off-by: Alexandre Bailon <[email protected]>
> ---
> arch/arm64/boot/dts/mediatek/mt8195.dtsi | 212 +++--------------------
> 1 file changed, 27 insertions(+), 185 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> index 5d8b68f86ce4..8aa2bf142622 100644
> --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> @@ -3600,50 +3600,31 @@ dp_tx: dp-tx@1c600000 {
> };
>
> thermal_zones: thermal-zones {
> - cpu0-thermal {
> + cpu-little {

How is it related to the commit?

Does not look tested. You just introduced warnings.

Best regards,
Krzysztof


2024-05-27 07:00:58

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 2/6] thermal: Add support of multi sensors to thermal_core

On 24/05/2024 16:31, Alexandre Bailon wrote:
> This adds support of multi sensors to thermal.
> Currently, this only support the get_temp operation.
> This returns an average temperature of all the sensors.
> If defined, a coefficient is applied to the value read from the sensor
> before computing the average.
>
> Signed-off-by: Alexandre Bailon <[email protected]>
> ---
> drivers/thermal/Makefile | 1 +
> drivers/thermal/thermal_core.h | 15 ++
> drivers/thermal/thermal_multi.c | 332 ++++++++++++++++++++++++++++++++
> include/uapi/linux/thermal.h | 5 +
> 4 files changed, 353 insertions(+)
> create mode 100644 drivers/thermal/thermal_multi.c

This does not really build...


./drivers/thermal/thermal_multi.c:249:38: error: initialization of ‘struct thermal_trip *’ from incompatible pointer type ‘struct thermal_trip_desc *’ [-Werror=incompatible-pointer-types]

Best regards,
Krzysztof


2024-05-27 07:02:38

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 2/6] thermal: Add support of multi sensors to thermal_core

On 27/05/2024 09:00, Krzysztof Kozlowski wrote:
> On 24/05/2024 16:31, Alexandre Bailon wrote:
>> This adds support of multi sensors to thermal.
>> Currently, this only support the get_temp operation.
>> This returns an average temperature of all the sensors.
>> If defined, a coefficient is applied to the value read from the sensor
>> before computing the average.
>>
>> Signed-off-by: Alexandre Bailon <[email protected]>
>> ---
>> drivers/thermal/Makefile | 1 +
>> drivers/thermal/thermal_core.h | 15 ++
>> drivers/thermal/thermal_multi.c | 332 ++++++++++++++++++++++++++++++++
>> include/uapi/linux/thermal.h | 5 +
>> 4 files changed, 353 insertions(+)
>> create mode 100644 drivers/thermal/thermal_multi.c
>
> This does not really build...
>
>
> ../drivers/thermal/thermal_multi.c:249:38: error: initialization of ‘struct thermal_trip *’ from incompatible pointer type ‘struct thermal_trip_desc *’ [-Werror=incompatible-pointer-types]

and there are other warnings:

warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst


Best regards,
Krzysztof


2024-05-28 17:00:11

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH v3 4/6] dt-bindings: thermal: Add a property to select the aggregation type

On Fri, May 24, 2024 at 04:31:48PM +0200, Alexandre Bailon wrote:
> This adds a new property named "aggregation" that could be used to
> select the aggregation type when there are multiple sensors assigned
> to a thermal zone.
>
> Signed-off-by: Alexandre Bailon <[email protected]>
> ---
> .../devicetree/bindings/thermal/thermal-zones.yaml | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> index fa7a72e2ba44..e6e4b46773e3 100644
> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> @@ -111,6 +111,14 @@ patternProperties:
> coefficients are ordered and are matched with sensors by means of the
> sensor ID. Additional coefficients are interpreted as constant offset.
>
> + aggregation:
> + $ref: /schemas/types.yaml#/definitions/string
> + enum:
> + - avg
> + - max
> + description:
> + Aggregation type to use compute a temperature from multiple sensors.

This is optional, so what's the default? I suppose you could make it
required if more than 1 sensor.

A boolean could work here as well unless you think there might be a 3rd
algorithm.

> +
> sustainable-power:
> $ref: /schemas/types.yaml#/definitions/uint32
> description:
> --
> 2.44.1
>

2024-05-29 08:15:25

by Alexandre Bailon

[permalink] [raw]
Subject: Re: [PATCH v3 4/6] dt-bindings: thermal: Add a property to select the aggregation type



On 5/27/24 08:55, Krzysztof Kozlowski wrote:
> On 24/05/2024 16:31, Alexandre Bailon wrote:
>> This adds a new property named "aggregation" that could be used to
>> select the aggregation type when there are multiple sensors assigned
>> to a thermal zone.
>>
>> Signed-off-by: Alexandre Bailon <[email protected]>
>> ---
>> .../devicetree/bindings/thermal/thermal-zones.yaml | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
>> index fa7a72e2ba44..e6e4b46773e3 100644
>> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
>> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
>> @@ -111,6 +111,14 @@ patternProperties:
>> coefficients are ordered and are matched with sensors by means of the
>> sensor ID. Additional coefficients are interpreted as constant offset.
>>
>> + aggregation:
>> + $ref: /schemas/types.yaml#/definitions/string
>> + enum:
>> + - avg
>> + - max
>> + description:
>> + Aggregation type to use compute a temperature from multiple sensors.
>
> Hm, why exactly this is a property of hardware? Why on one board you
> would like to average and on other use max? I can only think of a case
> that some sensors give inaccurate data. Otherwise it is OS policyIn the case of Mediatek SoC, we only need max temperature.
I am not really sure if there is really an use case for the average.
Maybe I should drop average if no-one has a use case for it.

Best Regards,
Alexandre
>
> Best regards,
> Krzysztof
>

2024-05-29 08:25:11

by Alexandre Bailon

[permalink] [raw]
Subject: Re: [PATCH v3 6/6] ARM64: mt8195: Use thermal aggregation for big and little cpu



On 5/27/24 08:56, Krzysztof Kozlowski wrote:
> On 24/05/2024 16:31, Alexandre Bailon wrote:
>> This uses the thermal aggregation for the mt8195 to get the maximal
>> temperature of big and little cpu clusters.
>>
>> Signed-off-by: Alexandre Bailon <[email protected]>
>> ---
>> arch/arm64/boot/dts/mediatek/mt8195.dtsi | 212 +++--------------------
>> 1 file changed, 27 insertions(+), 185 deletions(-)
>>
>> diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>> index 5d8b68f86ce4..8aa2bf142622 100644
>> --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>> +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>> @@ -3600,50 +3600,31 @@ dp_tx: dp-tx@1c600000 {
>> };
>>
>> thermal_zones: thermal-zones {
>> - cpu0-thermal {
>> + cpu-little {
>
> How is it related to the commit?
To aggregate all thermal sensors which are in same performance domain,
we removes each individual nodes to create only two: cpu-little and cpu-big.
We need to remove the other nodes because their trips points and cooling
devices would conflict with those we define for the multi sensor thermal
zone.

Best Regards,
Alexandre
>
> Does not look tested. You just introduced warnings.
>
> Best regards,
> Krzysztof
>

2024-05-29 08:25:57

by Alexandre Bailon

[permalink] [raw]
Subject: Re: [PATCH v3 2/6] thermal: Add support of multi sensors to thermal_core



On 5/27/24 09:00, Krzysztof Kozlowski wrote:
> On 24/05/2024 16:31, Alexandre Bailon wrote:
>> This adds support of multi sensors to thermal.
>> Currently, this only support the get_temp operation.
>> This returns an average temperature of all the sensors.
>> If defined, a coefficient is applied to the value read from the sensor
>> before computing the average.
>>
>> Signed-off-by: Alexandre Bailon <[email protected]>
>> ---
>> drivers/thermal/Makefile | 1 +
>> drivers/thermal/thermal_core.h | 15 ++
>> drivers/thermal/thermal_multi.c | 332 ++++++++++++++++++++++++++++++++
>> include/uapi/linux/thermal.h | 5 +
>> 4 files changed, 353 insertions(+)
>> create mode 100644 drivers/thermal/thermal_multi.c
>
> This does not really build...
Sorry for that. I have not rebased and tested my patches on the latest
master commit. I will be more careful for the V4.

Best regards,
Alexandre
>
>
> ../drivers/thermal/thermal_multi.c:249:38: error: initialization of ‘struct thermal_trip *’ from incompatible pointer type ‘struct thermal_trip_desc *’ [-Werror=incompatible-pointer-types]
>
> Best regards,
> Krzysztof
>

2024-05-29 12:39:31

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 6/6] ARM64: mt8195: Use thermal aggregation for big and little cpu

On 29/05/2024 10:19, Alexandre Bailon wrote:
>
>
> On 5/27/24 08:56, Krzysztof Kozlowski wrote:
>> On 24/05/2024 16:31, Alexandre Bailon wrote:
>>> This uses the thermal aggregation for the mt8195 to get the maximal
>>> temperature of big and little cpu clusters.
>>>
>>> Signed-off-by: Alexandre Bailon <[email protected]>
>>> ---
>>> arch/arm64/boot/dts/mediatek/mt8195.dtsi | 212 +++--------------------
>>> 1 file changed, 27 insertions(+), 185 deletions(-)
>>>
>>> diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> index 5d8b68f86ce4..8aa2bf142622 100644
>>> --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> @@ -3600,50 +3600,31 @@ dp_tx: dp-tx@1c600000 {
>>> };
>>>
>>> thermal_zones: thermal-zones {
>>> - cpu0-thermal {
>>> + cpu-little {
>>
>> How is it related to the commit?
> To aggregate all thermal sensors which are in same performance domain,
> we removes each individual nodes to create only two: cpu-little and cpu-big.

OK...

> We need to remove the other nodes because their trips points and cooling
> devices would conflict with those we define for the multi sensor thermal
> zone.
>
> Best Regards,
> Alexandre
>>
>> Does not look tested. You just introduced warnings.

Yet still I claim it wasn't tested. See SoC maintainer profile documents
and/or writing-schema.

Best regards,
Krzysztof