2023-07-03 17:45:07

by Nícolas F. R. A. Prado

Subject: [PATCH] thermal/core: Don't update trip points inside the hysteresis range

When searching for the trip points that need to be set, the nearest trip
point's temperature is used for the high trip, while the nearest trip
point's temperature minus the hysteresis is used for the low trip. The
issue with this logic is that when the current temperature is inside a
trip point's hysteresis range, both high and low trips will come from
the same trip point. As a consequence, instability can still occur as
follows:
* the temperature rises slightly and enters the hysteresis range of a
  trip point
* polling happens and updates the trip points to the hysteresis range
* the temperature falls slightly, exiting the hysteresis range, crossing
  the trip point and triggering an IRQ; the trip points are updated
* repeat

So even though the current hysteresis implementation prevents
instability caused by IRQs triggering on the same temperature value in
both directions, it doesn't prevent instability caused by an IRQ in one
direction and polling in the other.
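
The selection logic described above can be reproduced outside the kernel. The following is only a minimal sketch: `struct trip`, `struct window` and `pick_window()` are hypothetical stand-ins (not kernel names), and the two trip points used in the usage note are made-up values in millidegrees Celsius. It records which trip point supplied each bound, so the "both bounds come from the same trip point" case becomes visible.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for struct thermal_trip: only the fields used here. */
struct trip {
	int temperature;
	int hysteresis;
};

/* Hypothetical result type: the selected bounds and the trip that supplied each. */
struct window {
	int low;
	int high;
	int low_id;	/* -1 if no trip supplied the low bound */
	int high_id;	/* -2 if no trip supplied the high bound */
};

/*
 * Selection as described in the commit message: the nearest
 * "temperature - hysteresis" below temp becomes the low bound, the
 * nearest trip temperature above temp becomes the high bound.
 */
static struct window pick_window(const struct trip *trips, int n, int temp)
{
	struct window w = {
		.low = -INT_MAX, .high = INT_MAX,
		.low_id = -1, .high_id = -2,
	};
	int i;

	assert(n >= 0);

	for (i = 0; i < n; i++) {
		int trip_low = trips[i].temperature - trips[i].hysteresis;

		if (trip_low < temp && trip_low > w.low) {
			w.low = trip_low;
			w.low_id = i;
		}

		if (trips[i].temperature > temp &&
		    trips[i].temperature < w.high) {
			w.high = trips[i].temperature;
			w.high_id = i;
		}
	}

	return w;
}
```

With trips at 50000 and 70000 and a hysteresis of 5000 each, a temperature of 47000 yields the window (45000, 50000) with both bounds taken from trip 0, whereas 60000 yields (45000, 70000) with the bounds taken from different trips.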

To properly implement hysteresis behavior, don't update the trip points
while the temperature is inside the hysteresis range. This way, the
previously set trip points stay in effect, which effectively remembers
the previous state (whether the temperature came from above or below the
range), so the right trip point is already set. The exception is when no
previous trip point was set, in which case a previous state doesn't
exist, and so it's sensible to allow the hysteresis range as trip
points.

Signed-off-by: Nícolas F. R. A. Prado <[email protected]>

---

drivers/thermal/thermal_trip.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c
index 907f3a4d7bc8..c386ac5d8bad 100644
--- a/drivers/thermal/thermal_trip.c
+++ b/drivers/thermal/thermal_trip.c
@@ -57,6 +57,7 @@ void __thermal_zone_set_trips(struct thermal_zone_device *tz)
 {
 	struct thermal_trip trip;
 	int low = -INT_MAX, high = INT_MAX;
+	int low_trip_id = -1, high_trip_id = -2;
 	int i, ret;

 	lockdep_assert_held(&tz->lock);
@@ -73,18 +74,34 @@ void __thermal_zone_set_trips(struct thermal_zone_device *tz)

 		trip_low = trip.temperature - trip.hysteresis;

-		if (trip_low < tz->temperature && trip_low > low)
+		if (trip_low < tz->temperature && trip_low > low) {
 			low = trip_low;
+			low_trip_id = i;
+		}

 		if (trip.temperature > tz->temperature &&
-		    trip.temperature < high)
+		    trip.temperature < high) {
 			high = trip.temperature;
+			high_trip_id = i;
+		}
 	}

 	/* No need to change trip points */
 	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
 		return;

+	/*
+	 * If the current temperature is inside a trip point's hysteresis range,
+	 * don't update the trip points, rely on the previously set ones to
+	 * remember the previous state.
+	 *
+	 * Unless no previous trip point was set, in which case there's no
+	 * previous state to remember.
+	 */
+	if ((tz->prev_low_trip > -INT_MAX || tz->prev_high_trip < INT_MAX) &&
+	    low_trip_id == high_trip_id)
+		return;
+
 	tz->prev_low_trip = low;
 	tz->prev_high_trip = high;

--
2.41.0
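
To see how the guard added by this patch behaves over a sequence of temperature readings, the logic can be lifted into a small userspace sketch. This is an illustration under assumptions, not the kernel code: `struct zone`, `zone_init()` and `zone_set_trips()` are hypothetical reductions that keep only the fields the patch reads, and the function returns whether the window would have been updated instead of programming any hardware.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct thermal_trip. */
struct trip {
	int temperature;
	int hysteresis;
};

/* Hypothetical reduction of struct thermal_zone_device. */
struct zone {
	const struct trip *trips;
	int num_trips;
	int temperature;
	int prev_low_trip;
	int prev_high_trip;
};

static void zone_init(struct zone *tz, const struct trip *trips, int n)
{
	assert(n >= 0);
	tz->trips = trips;
	tz->num_trips = n;
	tz->temperature = 0;
	/* "No previous trip point set" is encoded as the widest window. */
	tz->prev_low_trip = -INT_MAX;
	tz->prev_high_trip = INT_MAX;
}

/* Returns true if the stored window was updated, false if it was kept. */
static bool zone_set_trips(struct zone *tz)
{
	int low = -INT_MAX, high = INT_MAX;
	int low_trip_id = -1, high_trip_id = -2;
	int i;

	for (i = 0; i < tz->num_trips; i++) {
		int trip_low = tz->trips[i].temperature - tz->trips[i].hysteresis;

		if (trip_low < tz->temperature && trip_low > low) {
			low = trip_low;
			low_trip_id = i;
		}

		if (tz->trips[i].temperature > tz->temperature &&
		    tz->trips[i].temperature < high) {
			high = tz->trips[i].temperature;
			high_trip_id = i;
		}
	}

	/* No need to change trip points */
	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
		return false;

	/* Inside a hysteresis range with a previous state: keep the old window. */
	if ((tz->prev_low_trip > -INT_MAX || tz->prev_high_trip < INT_MAX) &&
	    low_trip_id == high_trip_id)
		return false;

	tz->prev_low_trip = low;
	tz->prev_high_trip = high;
	return true;
}
```

With made-up trips at 50000 and 70000 (hysteresis 5000 each), a reading of 52000 sets the window to (45000, 70000). A following reading of 47000 is inside trip 0's hysteresis range and leaves that window untouched. Only when the temperature drops to 44000 does the window move to (-INT_MAX, 50000), and a subsequent rise back to 47000 again leaves it alone.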



2023-08-21 23:31:18

by Rafael J. Wysocki

Subject: Re: [PATCH] thermal/core: Don't update trip points inside the hysteresis range

On Mon, Jul 3, 2023 at 7:15 PM Nícolas F. R. A. Prado
<[email protected]> wrote:
>
> When searching for the trip points that need to be set, the nearest trip
> point's temperature is used for the high trip, while the nearest trip
> point's temperature minus the hysteresis is used for the low trip. The
> issue with this logic is that when the current temperature is inside a
> trip point's hysteresis range, both high and low trips will come from
> the same trip point. As a consequence instability can still occur like
> this:
> * the temperature rises slightly and enters the hysteresis range of a
> trip point
> * polling happens and updates the trip points to the hysteresis range
> * the temperature falls slightly, exiting the hysteresis range, crossing
> the trip point and triggering an IRQ, the trip points are updated
> * repeat
>
> So even though the current hysteresis implementation prevents
> instability from happening due to IRQs triggering on the same
> temperature value, both ways, it doesn't prevent it from happening due
> to an IRQ on one way and polling on the other.
>
> To properly implement a hysteresis behavior, when inside the hysteresis
> range, don't update the trip points. This way, the previously set trip
> points will stay in effect, which will in a way remember the previous
> state (if the temperature signal came from above or below the range) and
> therefore have the right trip point already set. The exception is if
> there was no previous trip point set, in which case a previous state
> doesn't exist, and so it's sensible to allow the hysteresis range as
> trip points.
>
> Signed-off-by: Nícolas F. R. A. Prado <[email protected]>
>
> ---
>
> drivers/thermal/thermal_trip.c | 21 +++++++++++++++++++--
> 1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c
> index 907f3a4d7bc8..c386ac5d8bad 100644
> --- a/drivers/thermal/thermal_trip.c
> +++ b/drivers/thermal/thermal_trip.c
> @@ -57,6 +57,7 @@ void __thermal_zone_set_trips(struct thermal_zone_device *tz)
> {
> struct thermal_trip trip;
> int low = -INT_MAX, high = INT_MAX;
> + int low_trip_id = -1, high_trip_id = -2;
> int i, ret;
>
> lockdep_assert_held(&tz->lock);
> @@ -73,18 +74,34 @@ void __thermal_zone_set_trips(struct thermal_zone_device *tz)
>
> trip_low = trip.temperature - trip.hysteresis;
>
> - if (trip_low < tz->temperature && trip_low > low)
> + if (trip_low < tz->temperature && trip_low > low) {
> low = trip_low;
> + low_trip_id = i;
> + }
>

I think I get the idea, but wouldn't a similar effect be achieved by
adding an "else" here?

> if (trip.temperature > tz->temperature &&
> - trip.temperature < high)
> + trip.temperature < high) {
> high = trip.temperature;
> + high_trip_id = i;
> + }
> }
>
> /* No need to change trip points */
> if (tz->prev_low_trip == low && tz->prev_high_trip == high)
> return;
>
> + /*
> + * If the current temperature is inside a trip point's hysteresis range,
> + * don't update the trip points, rely on the previously set ones to
> + * remember the previous state.
> + *
> + * Unless no previous trip point was set, in which case there's no
> + * previous state to remember.
> + */
> + if ((tz->prev_low_trip > -INT_MAX || tz->prev_high_trip < INT_MAX) &&
> + low_trip_id == high_trip_id)
> + return;
> +
> tz->prev_low_trip = low;
> tz->prev_high_trip = high;
>
> --

2023-08-22 13:13:55

by Rafael J. Wysocki

Subject: Re: [PATCH] thermal/core: Don't update trip points inside the hysteresis range

On Tue, Aug 22, 2023 at 12:25 AM Nícolas F. R. A. Prado
<[email protected]> wrote:
>
> On Mon, Aug 21, 2023 at 11:10:27PM +0200, Rafael J. Wysocki wrote:
> > On Mon, Jul 3, 2023 at 7:15 PM Nícolas F. R. A. Prado
> > <[email protected]> wrote:
> > >
> > > When searching for the trip points that need to be set, the nearest trip
> > > point's temperature is used for the high trip, while the nearest trip
> > > point's temperature minus the hysteresis is used for the low trip. The
> > > issue with this logic is that when the current temperature is inside a
> > > trip point's hysteresis range, both high and low trips will come from
> > > the same trip point. As a consequence instability can still occur like
> > > this:
> > > * the temperature rises slightly and enters the hysteresis range of a
> > > trip point
> > > * polling happens and updates the trip points to the hysteresis range
> > > * the temperature falls slightly, exiting the hysteresis range, crossing
> > > the trip point and triggering an IRQ, the trip points are updated
> > > * repeat
> > >
> > > So even though the current hysteresis implementation prevents
> > > instability from happening due to IRQs triggering on the same
> > > temperature value, both ways, it doesn't prevent it from happening due
> > > to an IRQ on one way and polling on the other.
> > >
> > > To properly implement a hysteresis behavior, when inside the hysteresis
> > > range, don't update the trip points. This way, the previously set trip
> > > points will stay in effect, which will in a way remember the previous
> > > state (if the temperature signal came from above or below the range) and
> > > therefore have the right trip point already set. The exception is if
> > > there was no previous trip point set, in which case a previous state
> > > doesn't exist, and so it's sensible to allow the hysteresis range as
> > > trip points.
> > >
> > > Signed-off-by: Nícolas F. R. A. Prado <[email protected]>
> > >
> > > ---
> > >
> > > drivers/thermal/thermal_trip.c | 21 +++++++++++++++++++--
> > > 1 file changed, 19 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/thermal/thermal_trip.c b/drivers/thermal/thermal_trip.c
> > > index 907f3a4d7bc8..c386ac5d8bad 100644
> > > --- a/drivers/thermal/thermal_trip.c
> > > +++ b/drivers/thermal/thermal_trip.c
> > > @@ -57,6 +57,7 @@ void __thermal_zone_set_trips(struct thermal_zone_device *tz)
> > > {
> > > struct thermal_trip trip;
> > > int low = -INT_MAX, high = INT_MAX;
> > > + int low_trip_id = -1, high_trip_id = -2;
> > > int i, ret;
> > >
> > > lockdep_assert_held(&tz->lock);
> > > @@ -73,18 +74,34 @@ void __thermal_zone_set_trips(struct thermal_zone_device *tz)
> > >
> > > trip_low = trip.temperature - trip.hysteresis;
> > >
> > > - if (trip_low < tz->temperature && trip_low > low)
> > > + if (trip_low < tz->temperature && trip_low > low) {
> > > low = trip_low;
> > > + low_trip_id = i;
> > > + }
> > >
> >
> > I think I get the idea, but wouldn't a similar effect be achieved by
> > adding an "else" here?
>
> No. That would only fix the problem in one direction, namely, when the
> temperature entered the hysteresis range from above. But when the temperature
> entered the range from below, we'd need to check the high threshold first to
> achieve the same result.
>
> The way I've implemented here is the simplest I could think of that works for
> both directions.

Well, what about the replacement patch below (untested)?

---
drivers/thermal/thermal_trip.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

Index: linux-pm/drivers/thermal/thermal_trip.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_trip.c
+++ linux-pm/drivers/thermal/thermal_trip.c
@@ -55,6 +55,7 @@ void __thermal_zone_set_trips(struct the
 {
 	struct thermal_trip trip;
 	int low = -INT_MAX, high = INT_MAX;
+	bool same_trip = false;
 	int i, ret;

 	lockdep_assert_held(&tz->lock);
@@ -63,6 +64,7 @@ void __thermal_zone_set_trips(struct the
 		return;

 	for (i = 0; i < tz->num_trips; i++) {
+		bool low_set = false;
 		int trip_low;

 		ret = __thermal_zone_get_trip(tz, i , &trip);
@@ -71,18 +73,31 @@ void __thermal_zone_set_trips(struct the

 		trip_low = trip.temperature - trip.hysteresis;

-		if (trip_low < tz->temperature && trip_low > low)
+		if (trip_low < tz->temperature && trip_low > low) {
 			low = trip_low;
+			low_set = true;
+			same_trip = false;
+		}

 		if (trip.temperature > tz->temperature &&
-		    trip.temperature < high)
+		    trip.temperature < high) {
 			high = trip.temperature;
+			same_trip = low_set;
+		}
 	}

 	/* No need to change trip points */
 	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
 		return;

+	/*
+	 * If "high" and "low" are the same, skip the change unless this is the
+	 * first time.
+	 */
+	if (same_trip && (tz->prev_low_trip != -INT_MAX ||
+	    tz->prev_high_trip != INT_MAX))
+		return;
+
 	tz->prev_low_trip = low;
 	tz->prev_high_trip = high;
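
Since the replacement patch is marked untested, one way to gain some confidence is to compare its `same_trip` flag against the `low_trip_id == high_trip_id` test from the original patch over a range of temperatures. The sketch below does that in userspace. All function names and `struct trip` are hypothetical stand-ins, and agreement over a sweep of sample values is only a sanity check, not a proof of equivalence.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct thermal_trip. */
struct trip {
	int temperature;
	int hysteresis;
};

/* Original patch: remember which trip supplied each bound and compare. */
static bool same_trip_by_id(const struct trip *trips, int n, int temp)
{
	int low = -INT_MAX, high = INT_MAX;
	int low_trip_id = -1, high_trip_id = -2;
	int i;

	for (i = 0; i < n; i++) {
		int trip_low = trips[i].temperature - trips[i].hysteresis;

		if (trip_low < temp && trip_low > low) {
			low = trip_low;
			low_trip_id = i;
		}

		if (trips[i].temperature > temp && trips[i].temperature < high) {
			high = trips[i].temperature;
			high_trip_id = i;
		}
	}

	return low_trip_id == high_trip_id;
}

/* Replacement patch: track a single flag while iterating. */
static bool same_trip_by_flag(const struct trip *trips, int n, int temp)
{
	int low = -INT_MAX, high = INT_MAX;
	bool same_trip = false;
	int i;

	for (i = 0; i < n; i++) {
		bool low_set = false;
		int trip_low = trips[i].temperature - trips[i].hysteresis;

		if (trip_low < temp && trip_low > low) {
			low = trip_low;
			low_set = true;
			same_trip = false;
		}

		if (trips[i].temperature > temp && trips[i].temperature < high) {
			high = trips[i].temperature;
			same_trip = low_set;
		}
	}

	return same_trip;
}

/* Assert that both variants agree for every sampled temperature. */
static void check_sweep(const struct trip *trips, int n,
			int from, int to, int step)
{
	int temp;

	assert(step > 0);

	for (temp = from; temp <= to; temp += step)
		assert(same_trip_by_id(trips, n, temp) ==
		       same_trip_by_flag(trips, n, temp));
}
```

Calling `check_sweep()` with a trip table that includes overlapping hysteresis ranges (for example trips at 50000 and 52000, both with hysteresis 5000) exercises the case where a later trip overrides only one of the two bounds.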