2014-07-30 23:06:23

by Matt Longnecker

[permalink] [raw]
Subject: [PATCH] thermal: tell cooling devices when a trip_point changes

Some hardware can react autonomously at a programmed temperature.
For example, an SoC might implement a last ditch throttle or a
hardware thermal shutdown. The driver for such a device can
register itself as a cooling_device with the thermal framework.

With this change, the thermal framework notifies such a driver
when userspace alters the relevant trip temperature so that
the driver can reprogram its hardware

Signed-off-by: Matt Longnecker <[email protected]>
---
drivers/thermal/thermal_core.c | 30 ++++++++++++++++++++++++++++++
include/linux/thermal.h | 2 ++
2 files changed, 32 insertions(+)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 71b0ec0..f25272e 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -597,6 +597,7 @@ trip_point_temp_store(struct device *dev, struct device_attribute *attr,
struct thermal_zone_device *tz = to_thermal_zone(dev);
int trip, ret;
unsigned long temperature;
+ struct thermal_instance *pos = NULL;

if (!tz->ops->set_trip_temp)
return -EPERM;
@@ -609,6 +610,20 @@ trip_point_temp_store(struct device *dev, struct device_attribute *attr,

ret = tz->ops->set_trip_temp(tz, trip, temperature);

+ /*
+ * Notify bound cooling devices that this trip point changed.
+ * This is useful for cooling devices which represent a behavior
+ * which trips in hardware (e.g. catastrophic shutdown)
+ */
+ list_for_each_entry(pos, &tz->thermal_instances, tz_node) {
+ if (pos->tz == tz && pos->trip == trip && pos->cdev) {
+ if (pos->cdev->ops->trip_point_changed)
+ pos->cdev->ops->trip_point_changed(pos->cdev,
+ pos->tz,
+ trip);
+ }
+ }
+
return ret ? ret : count;
}

@@ -641,6 +656,7 @@ trip_point_hyst_store(struct device *dev, struct device_attribute *attr,
struct thermal_zone_device *tz = to_thermal_zone(dev);
int trip, ret;
unsigned long temperature;
+ struct thermal_instance *pos = NULL;

if (!tz->ops->set_trip_hyst)
return -EPERM;
@@ -658,6 +674,20 @@ trip_point_hyst_store(struct device *dev, struct device_attribute *attr,
*/
ret = tz->ops->set_trip_hyst(tz, trip, temperature);

+ /*
+ * Notify bound cooling devices that this trip point changed.
+ * This is useful for cooling devices which represent a behavior
+ * which trips in hardware (e.g. catastrophic shutdown)
+ */
+ list_for_each_entry(pos, &tz->thermal_instances, tz_node) {
+ if (pos->tz == tz && pos->trip == trip && pos->cdev) {
+ if (pos->cdev->ops->trip_point_changed)
+ pos->cdev->ops->trip_point_changed(pos->cdev,
+ pos->tz,
+ trip);
+ }
+ }
+
return ret ? ret : count;
}

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7..7da7fc5 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -138,6 +138,8 @@ struct thermal_cooling_device_ops {
int (*get_max_state) (struct thermal_cooling_device *, unsigned long *);
int (*get_cur_state) (struct thermal_cooling_device *, unsigned long *);
int (*set_cur_state) (struct thermal_cooling_device *, unsigned long);
+ void (*trip_point_changed) (struct thermal_cooling_device *,
+ struct thermal_zone_device *, int);
};

struct thermal_cooling_device {
--
1.7.9.5


2014-07-31 07:59:30

by Javi Merino

[permalink] [raw]
Subject: Re: [PATCH] thermal: tell cooling devices when a trip_point changes

On Thu, Jul 31, 2014 at 12:10:40AM +0100, Matt Longnecker wrote:
> Some hardware can react autonomously at a programmed temperature.
> For example, an SoC might implement a last ditch throttle or a
> hardware thermal shutdown. The driver for such a device can
> register itself as a cooling_device with the thermal framework.
>
> With this change, the thermal framework notifies such a driver
> when userspace alters the relevant trip temperature so that
> the driver can reprogram its hardware

Why can't you just use the existing cooling device interface? Cooling
devices can be bound to trip points. Most thermal governors will
increase cooling for that cooling device when the trip point is hit.
The last ditch throttle or hardware thermal shutdown will then kick
when the cooling state changes to 1.

If the existing governors are too complex for what you want, you can
have a look at the bang bang governor[0] which (I think) is bound to
be merged soon.

[0] http://article.gmane.org/gmane.linux.kernel/1753348

Cheers,
Javi

2014-07-31 08:31:04

by Zhang, Rui

[permalink] [raw]
Subject: Re: [PATCH] thermal: tell cooling devices when a trip_point changes

On Wed, 2014-07-30 at 16:10 -0700, Matt Longnecker wrote:
> Some hardware can react autonomously at a programmed temperature.

if you have a temperature sensor, then you should have a thermal zone
device driver for it, right?

> For example, an SoC might implement a last ditch throttle or a
> hardware thermal shutdown.

And it should be handled by the thermal zone driver, who has the
knowledge of when to do throttle/shutdown.

> The driver for such a device can
> register itself as a cooling_device with the thermal framework.
>
cooling device just exports its cooling ability for thermal framework to
use. It never makes any decision about thermal control.

> With this change, the thermal framework notifies such a driver
> when userspace alters the relevant trip temperature so that
> the driver can reprogram its hardware
>
When user space alters the trip temperature, the thermal zone device
driver is aware is this change, and it can optionally program the
hardware.

thanks,
rui
> Signed-off-by: Matt Longnecker <[email protected]>
> ---
> drivers/thermal/thermal_core.c | 30 ++++++++++++++++++++++++++++++
> include/linux/thermal.h | 2 ++
> 2 files changed, 32 insertions(+)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 71b0ec0..f25272e 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -597,6 +597,7 @@ trip_point_temp_store(struct device *dev, struct device_attribute *attr,
> struct thermal_zone_device *tz = to_thermal_zone(dev);
> int trip, ret;
> unsigned long temperature;
> + struct thermal_instance *pos = NULL;
>
> if (!tz->ops->set_trip_temp)
> return -EPERM;
> @@ -609,6 +610,20 @@ trip_point_temp_store(struct device *dev, struct device_attribute *attr,
>
> ret = tz->ops->set_trip_temp(tz, trip, temperature);
>
> + /*
> + * Notify bound cooling devices that this trip point changed.
> + * This is useful for cooling devices which represent a behavior
> + * which trips in hardware (e.g. catastrophic shutdown)
> + */
> + list_for_each_entry(pos, &tz->thermal_instances, tz_node) {
> + if (pos->tz == tz && pos->trip == trip && pos->cdev) {
> + if (pos->cdev->ops->trip_point_changed)
> + pos->cdev->ops->trip_point_changed(pos->cdev,
> + pos->tz,
> + trip);
> + }
> + }
> +
> return ret ? ret : count;
> }
>
> @@ -641,6 +656,7 @@ trip_point_hyst_store(struct device *dev, struct device_attribute *attr,
> struct thermal_zone_device *tz = to_thermal_zone(dev);
> int trip, ret;
> unsigned long temperature;
> + struct thermal_instance *pos = NULL;
>
> if (!tz->ops->set_trip_hyst)
> return -EPERM;
> @@ -658,6 +674,20 @@ trip_point_hyst_store(struct device *dev, struct device_attribute *attr,
> */
> ret = tz->ops->set_trip_hyst(tz, trip, temperature);
>
> + /*
> + * Notify bound cooling devices that this trip point changed.
> + * This is useful for cooling devices which represent a behavior
> + * which trips in hardware (e.g. catastrophic shutdown)
> + */
> + list_for_each_entry(pos, &tz->thermal_instances, tz_node) {
> + if (pos->tz == tz && pos->trip == trip && pos->cdev) {
> + if (pos->cdev->ops->trip_point_changed)
> + pos->cdev->ops->trip_point_changed(pos->cdev,
> + pos->tz,
> + trip);
> + }
> + }
> +
> return ret ? ret : count;
> }
>
> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
> index f7e11c7..7da7fc5 100644
> --- a/include/linux/thermal.h
> +++ b/include/linux/thermal.h
> @@ -138,6 +138,8 @@ struct thermal_cooling_device_ops {
> int (*get_max_state) (struct thermal_cooling_device *, unsigned long *);
> int (*get_cur_state) (struct thermal_cooling_device *, unsigned long *);
> int (*set_cur_state) (struct thermal_cooling_device *, unsigned long);
> + void (*trip_point_changed) (struct thermal_cooling_device *,
> + struct thermal_zone_device *, int);
> };
>
> struct thermal_cooling_device {

2014-07-31 17:12:58

by Matt Longnecker

[permalink] [raw]
Subject: Re: [PATCH] thermal: tell cooling devices when a trip_point changes

I should have labeled my message RFC. My intent was to generate some
public discussion about a problem I've hit -- not to get this particular
patch merged.

On 07/31/2014 01:30 AM, Zhang Rui wrote:
> On Wed, 2014-07-30 at 16:10 -0700, Matt Longnecker wrote:
>> Some hardware can react autonomously at a programmed temperature.
>
> if you have a temperature sensor, then you should have a thermal zone
> device driver for it, right?
>

Yes, but....

In existing "downstream" Tegra kernels, thermal sensor drivers directly
register thermal zones with thermal_core.c. Drivers get information for
the zone (e.g. trip points, cooling maps, etc). Each driver has its own
format for this zone-related information.

Now we're moving away from board files and toward device tree.

of-thermal offers a canonical way to encode zone-related information in
device tree. I don't want to invent a competing alternative.

As of-thermal parses device tree, it creates thermal zones. Sensor
drivers no longer need to register zones -- they just register with
of-thermal as sensor drivers.

of-thermal hides trip points and cooling maps from sensor drivers. This
is good and bad. It's good because it eliminates a _lot_ of thermal zone
boilerplate code from sensor drivers. It's bad because it breaks an
existing usecase:
* the kernel configures the hardware thermal reaction (e.g. catastrophic
shutdown) according to platform-data provided at boot
* a user adjusts the shutdown temperature at runtime by poking at
/sys/class/thermal/thermal_zone*/trip_point_*_temp

My patch offers one way to support this usecase in systems using
of-thermal to manage thermal zones. I'm happy to discuss alternatives.

>> For example, an SoC might implement a last ditch throttle or a
>> hardware thermal shutdown.
>
> And it should be handled by the thermal zone driver, who has the
> knowledge of when to do throttle/shutdown.
>

of-thermal doesn't support this because it doesn't propagate trip point
changes to sensor drivers.

Thanks,
Matt Longnecker