2024-05-09 19:18:10

by Rafael J. Wysocki

Subject: [PATCH v1 0/7] thermal/debugfs: Assorted improvements for the 6.11 cycle

Hi Everyone,

This series is for the 6.11 cycle, but since it is ready from my POV,
here it goes in case people have the time to look at it in the meantime.

The patches in the series address some minor issues in the thermal
debugfs code and clean it up somewhat.

Please refer to the individual patch changelogs for details.

At some point I'm going to put this series on a separate git branch
for easier access/testing.

Thanks!





2024-05-09 19:18:23

by Rafael J. Wysocki

Subject: [PATCH v1 6/7] thermal/debugfs: Move some statements from under thermal_dbg->lock

From: Rafael J. Wysocki <[email protected]>

The tz_dbg local variable assignments in thermal_debug_tz_trip_up(),
thermal_debug_tz_trip_down(), and thermal_debug_update_trip_stats()
need not be carried out under thermal_dbg->lock, so move them from
under that lock (to avoid possible future confusion).

While at it, reorder local variable definitions in
thermal_debug_tz_trip_up() for more clarity.

No functional impact.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/thermal/thermal_debugfs.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -568,19 +568,19 @@ static struct tz_episode *thermal_debugf
void thermal_debug_tz_trip_up(struct thermal_zone_device *tz,
const struct thermal_trip *trip)
{
- struct tz_episode *tze;
- struct tz_debugfs *tz_dbg;
struct thermal_debugfs *thermal_dbg = tz->debugfs;
int trip_id = thermal_zone_trip_id(tz, trip);
ktime_t now = ktime_get();
+ struct tz_debugfs *tz_dbg;
+ struct tz_episode *tze;

if (!thermal_dbg)
return;

- mutex_lock(&thermal_dbg->lock);
-
tz_dbg = &thermal_dbg->tz_dbg;

+ mutex_lock(&thermal_dbg->lock);
+
/*
* The mitigation is starting. A mitigation can contain
* several episodes where each of them is related to a
@@ -667,10 +667,10 @@ void thermal_debug_tz_trip_down(struct t
if (!thermal_dbg)
return;

- mutex_lock(&thermal_dbg->lock);
-
tz_dbg = &thermal_dbg->tz_dbg;

+ mutex_lock(&thermal_dbg->lock);
+
/*
* The temperature crosses the way down but there was not
* mitigation detected before. That may happen when the
@@ -719,10 +719,10 @@ void thermal_debug_update_trip_stats(str
if (!thermal_dbg)
return;

- mutex_lock(&thermal_dbg->lock);
-
tz_dbg = &thermal_dbg->tz_dbg;

+ mutex_lock(&thermal_dbg->lock);
+
if (!tz_dbg->nr_trips)
goto out;
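
As a rough user-space illustration of why the assignment can sit outside the lock (this is not part of the patch): taking the address of an embedded member only does pointer arithmetic on thermal_dbg and touches no shared data, while every access through the resulting pointer still has to be locked. The `*_sketch` names and the pthread mutex below are hypothetical stand-ins for the kernel structures and `mutex_lock()`.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct tz_debugfs_sketch {
	int nr_trips;
};

struct thermal_debugfs_sketch {
	pthread_mutex_t lock;
	struct tz_debugfs_sketch tz_dbg;
};

/* Record one more crossed trip; returns the new count, or -1 without debugfs. */
int trip_up_sketch(struct thermal_debugfs_sketch *thermal_dbg)
{
	struct tz_debugfs_sketch *tz_dbg;
	int nr_trips;

	if (!thermal_dbg)
		return -1;

	/* Address computation only: no shared data is read or written here. */
	tz_dbg = &thermal_dbg->tz_dbg;

	pthread_mutex_lock(&thermal_dbg->lock);
	/* Accesses through tz_dbg do need the lock. */
	nr_trips = ++tz_dbg->nr_trips;
	pthread_mutex_unlock(&thermal_dbg->lock);

	return nr_trips;
}
```

The early return for a missing debugfs object stays ahead of both statements, as in the patched functions.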





2024-05-09 19:18:33

by Rafael J. Wysocki

Subject: [PATCH v1 4/7] thermal/debugfs: Fix up units in "mitigations" files

From: Rafael J. Wysocki <[email protected]>

Print temperature units as m°C rather than °mC (the meaning of which is
unclear) and add the time unit to the duration column.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/thermal/thermal_debugfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -792,7 +792,7 @@ static int tze_seq_show(struct seq_file
seq_printf(s, ",-Mitigation at %llums, duration%c%llums\n",
ktime_to_ms(tze->timestamp), c, duration_ms);

- seq_printf(s, "| trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |\n");
+ seq_printf(s, "| trip | type | temp(m°C) | hyst(m°C) | duration(ms) | avg(m°C) | min(m°C) | max(m°C) |\n");

for_each_trip_desc(tz, td) {
const struct thermal_trip *trip = &td->trip;
@@ -842,7 +842,7 @@ static int tze_seq_show(struct seq_file
8, type,
9, trip->temperature,
9, trip->hysteresis,
- c, 10, duration_ms,
+ c, 11, duration_ms,
9, trip_stats->avg,
9, trip_stats->min,
9, trip_stats->max);
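
The width change from 10 to 11 can be checked with a small user-space sketch (not part of the patch). It assumes that the duration column is exactly as wide as the "duration(ms)" label, which the archived header line cannot confirm because its spacing may have been collapsed. The helper name is hypothetical; it formats the cell the same way the row format string does, with a one-character marker followed by a `%*lld` field.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one duration cell: a one-character marker followed by a
 * right-aligned number padded to the given width.
 * Returns the number of characters snprintf() would have written.
 */
int format_duration_cell(char *buf, size_t size, char c, int width,
			 long long duration_ms)
{
	assert(buf && size > 0);

	return snprintf(buf, size, "%c%*lld", c, width, duration_ms);
}
```

With a width of 11, the marker plus the number field add up to 12 characters, the length of "duration(ms)".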




2024-05-09 19:18:59

by Rafael J. Wysocki

Subject: [PATCH v1 5/7] thermal/debugfs: Compute maximum temperature for mitigation episode as a whole

From: Rafael J. Wysocki <[email protected]>

Notice that the maximum temperature above the trip point must be the
same for all of the trip points involved in a given mitigation episode,
so it need not be computed for each of them separately.

It is sufficient to compute the maximum temperature for the mitigation
episode as a whole and print it accordingly, so do that.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/thermal/thermal_debugfs.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)

Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -92,7 +92,6 @@ struct cdev_record {
* @timestamp: the trip crossing timestamp
* @duration: total time when the zone temperature was above the trip point
* @count: the number of times the zone temperature was above the trip point
- * @max: maximum recorded temperature above the trip point
* @min: minimum recorded temperature above the trip point
* @avg: average temperature above the trip point
*/
@@ -100,7 +99,6 @@ struct trip_stats {
ktime_t timestamp;
ktime_t duration;
int count;
- int max;
int min;
int avg;
};
@@ -115,15 +113,17 @@ struct trip_stats {
* the way up and down if there are multiple trip described in the
* firmware after the lowest temperature trip point.
*
+ * @node: a list element to be added to the list of tz events
* @timestamp: first trip point crossed the way up
* @duration: total duration of the mitigation episode
- * @node: a list element to be added to the list of tz events
+ * @max_temp: maximum zone temperature during this episode
* @trip_stats: per trip point statistics, flexible array
*/
struct tz_episode {
+ struct list_head node;
ktime_t timestamp;
ktime_t duration;
- struct list_head node;
+ int max_temp;
struct trip_stats trip_stats[];
};

@@ -557,11 +557,10 @@ static struct tz_episode *thermal_debugf
INIT_LIST_HEAD(&tze->node);
tze->timestamp = now;
tze->duration = KTIME_MIN;
+ tze->max_temp = THERMAL_TEMP_INVALID;

- for (i = 0; i < tz->num_trips; i++) {
+ for (i = 0; i < tz->num_trips; i++)
tze->trip_stats[i].min = INT_MAX;
- tze->trip_stats[i].max = INT_MIN;
- }

return tze;
}
@@ -729,11 +728,13 @@ void thermal_debug_update_trip_stats(str

tze = list_first_entry(&tz_dbg->tz_episodes, struct tz_episode, node);

+ if (tz->temperature > tze->max_temp)
+ tze->max_temp = tz->temperature;
+
for (i = 0; i < tz_dbg->nr_trips; i++) {
int trip_id = tz_dbg->trips_crossed[i];
struct trip_stats *trip_stats = &tze->trip_stats[trip_id];

- trip_stats->max = max(trip_stats->max, tz->temperature);
trip_stats->min = min(trip_stats->min, tz->temperature);
trip_stats->avg += (tz->temperature - trip_stats->avg) /
++trip_stats->count;
@@ -789,10 +790,10 @@ static int tze_seq_show(struct seq_file
c = '=';
}

- seq_printf(s, ",-Mitigation at %llums, duration%c%llums\n",
- ktime_to_ms(tze->timestamp), c, duration_ms);
+ seq_printf(s, ",-Mitigation at %llums, duration%c%llums, max. temp=%dm°C\n",
+ ktime_to_ms(tze->timestamp), c, duration_ms, tze->max_temp);

- seq_printf(s, "| trip | type | temp(m°C) | hyst(m°C) | duration(ms) | avg(m°C) | min(m°C) | max(m°C) |\n");
+ seq_printf(s, "| trip | type | temp(m°C) | hyst(m°C) | duration(ms) | avg(m°C) | min(m°C) |\n");

for_each_trip_desc(tz, td) {
const struct thermal_trip *trip = &td->trip;
@@ -814,7 +815,7 @@ static int tze_seq_show(struct seq_file
trip_stats = &tze->trip_stats[trip_id];

/* Skip trips without any stats. */
- if (trip_stats->min > trip_stats->max)
+ if (trip_stats->min == INT_MAX)
continue;

if (trip->type == THERMAL_TRIP_PASSIVE)
@@ -837,15 +838,14 @@ static int tze_seq_show(struct seq_file
c = ' ';
}

- seq_printf(s, "| %*d | %*s | %*d | %*d | %c%*lld | %*d | %*d | %*d |\n",
+ seq_printf(s, "| %*d | %*s | %*d | %*d | %c%*lld | %*d | %*d |\n",
4 , trip_id,
8, type,
9, trip->temperature,
9, trip->hysteresis,
c, 11, duration_ms,
9, trip_stats->avg,
- 9, trip_stats->min,
- 9, trip_stats->max);
+ 9, trip_stats->min);
}

return 0;
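
A minimal user-space sketch of the resulting bookkeeping, under stated assumptions (not the kernel code): the `*_sketch` names, the fixed trip count, and the `SKETCH_TEMP_INVALID` value are stand-ins chosen for illustration. It keeps one maximum per episode, keeps minimum and running average per crossed trip, and treats a minimum still equal to INT_MAX as "no stats recorded".

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_TEMP_INVALID	(-274000)	/* stand-in for THERMAL_TEMP_INVALID */
#define SKETCH_NUM_TRIPS	3		/* arbitrary size for this sketch */

struct trip_stats_sketch {
	int count;
	int min;
	int avg;
};

struct episode_sketch {
	int max_temp;	/* one maximum for the episode as a whole */
	struct trip_stats_sketch trip_stats[SKETCH_NUM_TRIPS];
};

void episode_init_sketch(struct episode_sketch *ep)
{
	int i;

	ep->max_temp = SKETCH_TEMP_INVALID;

	for (i = 0; i < SKETCH_NUM_TRIPS; i++) {
		ep->trip_stats[i].count = 0;
		ep->trip_stats[i].min = INT_MAX;
		ep->trip_stats[i].avg = 0;
	}
}

/* Account for one temperature sample while the listed trips are crossed. */
void episode_update_sketch(struct episode_sketch *ep, const int *trips_crossed,
			   int nr_trips, int temperature)
{
	int i;

	if (temperature > ep->max_temp)
		ep->max_temp = temperature;

	for (i = 0; i < nr_trips; i++) {
		int trip_id = trips_crossed[i];
		struct trip_stats_sketch *ts;

		assert(trip_id >= 0 && trip_id < SKETCH_NUM_TRIPS);
		ts = &ep->trip_stats[trip_id];

		if (temperature < ts->min)
			ts->min = temperature;

		/* Incremental integer average, as in the kernel code. */
		ts->count++;
		ts->avg += (temperature - ts->avg) / ts->count;
	}
}

/* A trip that was never crossed still has its initial minimum. */
int trip_has_stats_sketch(const struct trip_stats_sketch *ts)
{
	return ts->min != INT_MAX;
}
```

Because the per-trip maximum is gone, the "skip trips without any stats" test can no longer compare min against max, hence the check against the initial INT_MAX value.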




2024-05-09 19:19:16

by Rafael J. Wysocki

Subject: [PATCH v1 3/7] thermal/debugfs: Print mitigation timestamp value in milliseconds

From: Rafael J. Wysocki <[email protected]>

Because mitigation episode duration is printed in milliseconds, there
is no reason to print timestamp information for mitigation episodes in
smaller units, which also makes the numbers somewhat harder to
interpret.

Print it in milliseconds for consistency.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/thermal/thermal_debugfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -789,8 +789,8 @@ static int tze_seq_show(struct seq_file
c = '=';
}

- seq_printf(s, ",-Mitigation at %lluus, duration%c%llums\n",
- ktime_to_us(tze->timestamp), c, duration_ms);
+ seq_printf(s, ",-Mitigation at %llums, duration%c%llums\n",
+ ktime_to_ms(tze->timestamp), c, duration_ms);

seq_printf(s, "| trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |\n");
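
For illustration only (not part of the patch): ktime_t values are nanosecond counts, so the two conversions differ by a factor of 1000. The helpers below are hypothetical user-space stand-ins for `ktime_to_us()` and `ktime_to_ms()`, written as plain integer divisions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for ktime_to_us(): nanoseconds to whole microseconds. */
int64_t ns_to_us_sketch(int64_t ns)
{
	return ns / 1000;
}

/* Stand-in for ktime_to_ms(): nanoseconds to whole milliseconds. */
int64_t ns_to_ms_sketch(int64_t ns)
{
	return ns / 1000000;
}
```

With the timestamp in milliseconds, it can be compared directly with the duration printed on the same line.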





2024-05-09 19:19:36

by Rafael J. Wysocki

Subject: [PATCH v1 2/7] thermal/debugfs: Do not extend mitigation episodes beyond system resume

From: Rafael J. Wysocki <[email protected]>

Because thermal zone handling by the thermal core is started from
scratch during resume from system-wide suspend, prevent the debug
code from extending mitigation episodes beyond that point by ending
the mitigation episode currently in progress, if any, for each thermal
zone.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/thermal/thermal_core.c | 1 +
drivers/thermal/thermal_debugfs.c | 36 ++++++++++++++++++++++++++++++++++++
drivers/thermal/thermal_debugfs.h | 2 ++
3 files changed, 39 insertions(+)

Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -1619,6 +1619,7 @@ static void thermal_zone_device_resume(s

tz->suspended = false;

+ thermal_debug_tz_resume(tz);
thermal_zone_device_init(tz);
__thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);

Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -922,3 +922,39 @@ void thermal_debug_tz_remove(struct ther
thermal_debugfs_remove_id(thermal_dbg);
kfree(trips_crossed);
}
+
+void thermal_debug_tz_resume(struct thermal_zone_device *tz)
+{
+ struct thermal_debugfs *thermal_dbg = tz->debugfs;
+ ktime_t now = ktime_get();
+ struct tz_debugfs *tz_dbg;
+ struct tz_episode *tze;
+ int i;
+
+ if (!thermal_dbg)
+ return;
+
+ mutex_lock(&thermal_dbg->lock);
+
+ tz_dbg = &thermal_dbg->tz_dbg;
+
+ if (!tz_dbg->nr_trips)
+ goto out;
+
+ /*
+ * A mitigation episode was in progress before the preceding system
+ * suspend transition, so close it because the zone handling is starting
+ * over from scratch.
+ */
+ tze = list_first_entry(&tz_dbg->tz_episodes, struct tz_episode, node);
+
+ for (i = 0; i < tz_dbg->nr_trips; i++)
+ tz_episode_close_trip(tze, tz_dbg->trips_crossed[i], now);
+
+ tze->duration = ktime_sub(now, tze->timestamp);
+
+ tz_dbg->nr_trips = 0;
+
+out:
+ mutex_unlock(&thermal_dbg->lock);
+}
Index: linux-pm/drivers/thermal/thermal_debugfs.h
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.h
+++ linux-pm/drivers/thermal/thermal_debugfs.h
@@ -7,6 +7,7 @@ void thermal_debug_cdev_remove(struct th
void thermal_debug_cdev_state_update(const struct thermal_cooling_device *cdev, int state);
void thermal_debug_tz_add(struct thermal_zone_device *tz);
void thermal_debug_tz_remove(struct thermal_zone_device *tz);
+void thermal_debug_tz_resume(struct thermal_zone_device *tz);
void thermal_debug_tz_trip_up(struct thermal_zone_device *tz,
const struct thermal_trip *trip);
void thermal_debug_tz_trip_down(struct thermal_zone_device *tz,
@@ -20,6 +21,7 @@ static inline void thermal_debug_cdev_st
int state) {}
static inline void thermal_debug_tz_add(struct thermal_zone_device *tz) {}
static inline void thermal_debug_tz_remove(struct thermal_zone_device *tz) {}
+static inline void thermal_debug_tz_resume(struct thermal_zone_device *tz) {}
static inline void thermal_debug_tz_trip_up(struct thermal_zone_device *tz,
const struct thermal_trip *trip) {};
static inline void thermal_debug_tz_trip_down(struct thermal_zone_device *tz,
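
The effect of the resume hook can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: times are plain nanosecond integers instead of ktime_t, the episode list is reduced to a single embedded episode, locking is left out, and all `*_sketch` names are hypothetical. The trip-closing function mirrors the tz_episode_close_trip() helper introduced in patch 1/7.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_KTIME_MAX	INT64_MAX	/* stand-in for KTIME_MAX */
#define SKETCH_MAX_TRIPS	4		/* arbitrary size for this sketch */

struct trip_stats_sketch {
	int64_t timestamp;	/* when the trip was crossed on the way up, in ns */
	int64_t duration;	/* accumulated time above the trip, in ns */
};

struct episode_sketch {
	int64_t timestamp;	/* start of the episode, in ns */
	int64_t duration;	/* total duration of the episode, in ns */
	struct trip_stats_sketch trip_stats[SKETCH_MAX_TRIPS];
};

struct tz_debugfs_sketch {
	struct episode_sketch episode;	/* the episode in progress, if any */
	int trips_crossed[SKETCH_MAX_TRIPS];
	int nr_trips;			/* number of trips currently crossed */
};

/* Add the time spent above the trip and mark the trip as closed. */
void episode_close_trip_sketch(struct episode_sketch *ep, int trip_id,
			       int64_t now)
{
	struct trip_stats_sketch *ts;

	assert(trip_id >= 0 && trip_id < SKETCH_MAX_TRIPS);
	ts = &ep->trip_stats[trip_id];

	ts->duration += now - ts->timestamp;
	ts->timestamp = SKETCH_KTIME_MAX;
}

/* End the episode in progress, if any, because zone handling restarts. */
void tz_resume_sketch(struct tz_debugfs_sketch *tz_dbg, int64_t now)
{
	int i;

	if (!tz_dbg->nr_trips)
		return;

	for (i = 0; i < tz_dbg->nr_trips; i++)
		episode_close_trip_sketch(&tz_dbg->episode,
					  tz_dbg->trips_crossed[i], now);

	tz_dbg->episode.duration = now - tz_dbg->episode.timestamp;
	tz_dbg->nr_trips = 0;
}
```

Resetting nr_trips to zero makes a second resume call a no-op, so an already closed episode is not extended.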




2024-05-09 19:20:06

by Rafael J. Wysocki

Subject: [PATCH v1 1/7] thermal/debugfs: Use helper to update trip point overstepping duration

From: Rafael J. Wysocki <[email protected]>

Add a helper for updating trip point overstepping duration to be called
from thermal_debug_tz_trip_down().

Subsequently, it will also be used during resume from system-wide
suspend.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/thermal/thermal_debugfs.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)

Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -645,14 +645,24 @@ unlock:
mutex_unlock(&thermal_dbg->lock);
}

+static void tz_episode_close_trip(struct tz_episode *tze, int trip_id, ktime_t now)
+{
+ struct trip_stats *trip_stats = &tze->trip_stats[trip_id];
+ ktime_t delta = ktime_sub(now, trip_stats->timestamp);
+
+ trip_stats->duration = ktime_add(delta, trip_stats->duration);
+ /* Mark the end of mitigation for this trip point. */
+ trip_stats->timestamp = KTIME_MAX;
+}
+
void thermal_debug_tz_trip_down(struct thermal_zone_device *tz,
const struct thermal_trip *trip)
{
struct thermal_debugfs *thermal_dbg = tz->debugfs;
+ int trip_id = thermal_zone_trip_id(tz, trip);
+ ktime_t now = ktime_get();
struct tz_episode *tze;
struct tz_debugfs *tz_dbg;
- ktime_t delta, now = ktime_get();
- int trip_id = thermal_zone_trip_id(tz, trip);
int i;

if (!thermal_dbg)
@@ -687,13 +697,7 @@ void thermal_debug_tz_trip_down(struct t

tze = list_first_entry(&tz_dbg->tz_episodes, struct tz_episode, node);

- delta = ktime_sub(now, tze->trip_stats[trip_id].timestamp);
-
- tze->trip_stats[trip_id].duration =
- ktime_add(delta, tze->trip_stats[trip_id].duration);
-
- /* Mark the end of mitigation for this trip point. */
- tze->trip_stats[trip_id].timestamp = KTIME_MAX;
+ tz_episode_close_trip(tze, trip_id, now);

/*
* This event closes the mitigation as we are crossing the