Add the "nvidia,thermtrips" property to support both HW and SW
shutdown.
Wei Ni (3):
of: add nvidia,thermtrips property
thermal: tegra: support hw and sw shutdown
arm64: dts: tegra210: set thermtrip
.../bindings/thermal/nvidia,tegra124-soctherm.txt | 20 ++++-
arch/arm64/boot/dts/nvidia/tegra210.dtsi | 15 ++--
drivers/thermal/tegra/soctherm.c | 99 ++++++++++++++++++----
drivers/thermal/tegra/soctherm.h | 6 ++
drivers/thermal/tegra/tegra210-soctherm.c | 8 ++
5 files changed, 124 insertions(+), 24 deletions(-)
--
2.7.4
Add optional property "nvidia,thermtrips".
If present, these trips will be used as HW shutdown trips,
and critical trips will be used as SW shutdown trips.
Signed-off-by: Wei Ni <[email protected]>
---
.../bindings/thermal/nvidia,tegra124-soctherm.txt | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
index b6c0ae53d4dc..ab66d6feab4b 100644
--- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
+++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
@@ -55,10 +55,21 @@ Required properties :
- #cooling-cells: Should be 1. This cooling device only support on/off state.
See ./thermal.txt for a description of this property.
+Optional properties:
+- nvidia,thermtrips : When present, this property specifies the temperature at
+ which the soctherm hardware will assert the thermal trigger signal to the
+ Power Management IC, which can be configured to reset or shut down the device.
+ It is an array of pairs where each pair represents a tsensor id followed by a
+ temperature in millidegrees Celsius. In the absence of this property the critical
+ trip point will be used for thermtrip temperature.
+
Note:
-- the "critical" type trip points will be set to SOC_THERM hardware as the
-shut down temperature. Once the temperature of this thermal zone is higher
-than it, the system will be shutdown or reset by hardware.
+- the "critical" type trip points will be used to set the temperature at which
+the SOC_THERM hardware will assert a thermal trigger if the "nvidia,thermtrips"
+property is missing. When the thermtrips property is present, the breach of a
+critical trip point is reported back to the thermal framework to implement
+software shutdown.
+
- the "hot" type trip points will be set to SOC_THERM hardware as the throttle
temperature. Once the the temperature of this thermal zone is higher
than it, it will trigger the HW throttle event.
@@ -79,6 +90,9 @@ Example :
#thermal-sensor-cells = <1>;
+ nvidia,thermtrips = <TEGRA124_SOCTHERM_SENSOR_CPU 102500
+ TEGRA124_SOCTHERM_SENSOR_GPU 103000>;
+
throttle-cfgs {
/*
* When the "heavy" cooling device triggered,
--
2.7.4
Currently the critical trip points in the thermal framework are the only
way to specify a temperature at which HW should shut down. This is
insufficient for certain platforms which want an orderly
software shutdown in addition to HW shutdown.
This change adds support for parsing the "nvidia,thermtrips" property.
It allows the soctherm DT node to specify thermtrip temperatures, so that
the critical trip points can be left to the thermal framework for
software shutdown.
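As a rough illustration of the parsing step (a minimal userspace sketch,
not the driver code): the property is a flat array of <id temperature>
cells that is unpacked into pairs. The names thermtrip_pair and
thermtrips_unpack are hypothetical. Unlike the driver, this sketch
ignores a trailing unpaired cell explicitly instead of reading from a
zero-initialized buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair type; the driver uses its own struct. */
struct thermtrip_pair {
	uint32_t id;		/* sensor group id */
	uint32_t temp_mc;	/* temperature in millidegrees Celsius */
};

/*
 * Unpack a flat <id temp id temp ...> cell array into pairs.
 * Pairs whose id is >= num_sensors are skipped, and a trailing
 * unpaired cell is ignored. Returns the number of pairs stored,
 * which is at most cap.
 */
static size_t thermtrips_unpack(const uint32_t *cells, size_t n_cells,
				uint32_t num_sensors,
				struct thermtrip_pair *out, size_t cap)
{
	size_t stored = 0;

	for (size_t j = 0; j + 1 < n_cells && stored < cap; j += 2) {
		if (cells[j] >= num_sensors)
			continue;

		out[stored].id = cells[j];
		out[stored].temp_mc = cells[j + 1];
		stored++;
	}

	return stored;
}
```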
Signed-off-by: Wei Ni <[email protected]>
---
drivers/thermal/tegra/soctherm.c | 99 ++++++++++++++++++++++++++-----
drivers/thermal/tegra/soctherm.h | 6 ++
drivers/thermal/tegra/tegra210-soctherm.c | 8 +++
3 files changed, 98 insertions(+), 15 deletions(-)
diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index ed28110a3535..673c3ffa9001 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -446,6 +446,24 @@ find_throttle_cfg_by_name(struct tegra_soctherm *ts, const char *name)
return NULL;
}
+static int tsensor_group_thermtrip_get(struct tegra_soctherm *ts, int id)
+{
+ int i, temp = min_low_temp;
+ struct tsensor_group_thermtrips *tt = ts->soc->thermtrips;
+
+ if (id >= TEGRA124_SOCTHERM_SENSOR_NUM)
+ return temp;
+
+ if (tt) {
+ for (i = 0; i < ts->soc->num_ttgs; i++) {
+ if (tt[i].id == id)
+ return tt[i].temp;
+ }
+ }
+
+ return temp;
+}
+
static int tegra_thermctl_set_trip_temp(void *data, int trip, int temp)
{
struct tegra_thermctl_zone *zone = data;
@@ -464,7 +482,16 @@ static int tegra_thermctl_set_trip_temp(void *data, int trip, int temp)
return ret;
if (type == THERMAL_TRIP_CRITICAL) {
- return thermtrip_program(dev, sg, temp);
+ /*
+ * If the thermtrips property is set in DT, there is
+ * no need to program the critical trip into HW;
+ * otherwise, program the critical trip into HW.
+ */
+ if (min_low_temp == tsensor_group_thermtrip_get(ts, sg->id))
+ return thermtrip_program(dev, sg, temp);
+ else
+ return 0;
+
} else if (type == THERMAL_TRIP_HOT) {
int i;
@@ -523,7 +550,8 @@ static int get_hot_temp(struct thermal_zone_device *tz, int *trip, int *temp)
* @dev: struct device * of the SOC_THERM instance
*
* Configure the SOC_THERM HW trip points, setting "THERMTRIP"
- * "THROTTLE" trip points , using "critical" or "hot" type trip_temp
+ * "THROTTLE" trip points , using "thermtrips", "critical" or "hot"
+ * type trip_temp
* from thermal zone.
* After they have been configured, THERMTRIP or THROTTLE will take
* action when the configured SoC thermal sensor group reaches a
@@ -545,28 +573,23 @@ static int tegra_soctherm_set_hwtrips(struct device *dev,
{
struct tegra_soctherm *ts = dev_get_drvdata(dev);
struct soctherm_throt_cfg *stc;
- int i, trip, temperature;
- int ret;
+ int i, trip, temperature, ret;
- ret = tz->ops->get_crit_temp(tz, &temperature);
- if (ret) {
- dev_warn(dev, "thermtrip: %s: missing critical temperature\n",
- sg->name);
- goto set_throttle;
- }
+ /* Get thermtrips. If missing, try to get critical trips. */
+ temperature = tsensor_group_thermtrip_get(ts, sg->id);
+ if (min_low_temp == temperature)
+ if (tz->ops->get_crit_temp(tz, &temperature))
+ temperature = max_high_temp;
ret = thermtrip_program(dev, sg, temperature);
if (ret) {
- dev_err(dev, "thermtrip: %s: error during enable\n",
- sg->name);
+ dev_err(dev, "thermtrip: %s: error during enable\n", sg->name);
return ret;
}
- dev_info(dev,
- "thermtrip: will shut down when %s reaches %d mC\n",
+ dev_info(dev, "thermtrip: will shut down when %s reaches %d mC\n",
sg->name, temperature);
-set_throttle:
ret = get_hot_temp(tz, &trip, &temperature);
if (ret) {
dev_warn(dev, "throttrip: %s: missing hot temperature\n",
@@ -907,6 +930,50 @@ static const struct thermal_cooling_device_ops throt_cooling_ops = {
.set_cur_state = throt_set_cdev_state,
};
+static int soctherm_thermtrips_parse(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct tegra_soctherm *ts = dev_get_drvdata(dev);
+ struct tsensor_group_thermtrips *tt = ts->soc->thermtrips;
+ const int max_num_prop = ts->soc->num_ttgs * 2;
+ u32 *tlb;
+ int i, j, n, ret;
+
+ if (!tt)
+ return -ENOMEM;
+
+ n = of_property_count_u32_elems(dev->of_node, "nvidia,thermtrips");
+ if (n <= 0) {
+ dev_info(dev,
+ "missing thermtrips, will use critical trips as shut down temp\n");
+ return n;
+ }
+
+ n = min(max_num_prop, n);
+
+ tlb = devm_kcalloc(&pdev->dev, max_num_prop, sizeof(u32), GFP_KERNEL);
+ if (!tlb)
+ return -ENOMEM;
+ ret = of_property_read_u32_array(dev->of_node, "nvidia,thermtrips",
+ tlb, n);
+ if (ret) {
+ dev_err(dev, "invalid num ele: thermtrips:%d\n", ret);
+ return ret;
+ }
+
+ i = 0;
+ for (j = 0; j < n; j = j + 2) {
+ if (tlb[j] >= TEGRA124_SOCTHERM_SENSOR_NUM)
+ continue;
+
+ tt[i].id = tlb[j];
+ tt[i].temp = tlb[j+1];
+ i++;
+ }
+
+ return 0;
+}
+
/**
* soctherm_init_hw_throt_cdev() - Parse the HW throttle configurations
* and register them as cooling devices.
@@ -1348,6 +1415,8 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
if (err)
return err;
+ soctherm_thermtrips_parse(pdev);
+
soctherm_init_hw_throt_cdev(pdev);
soctherm_init(pdev);
diff --git a/drivers/thermal/tegra/soctherm.h b/drivers/thermal/tegra/soctherm.h
index e96ca73fd780..c05c7e37e968 100644
--- a/drivers/thermal/tegra/soctherm.h
+++ b/drivers/thermal/tegra/soctherm.h
@@ -92,6 +92,11 @@ struct tegra_tsensor {
const struct tegra_tsensor_group *group;
};
+struct tsensor_group_thermtrips {
+ u8 id;
+ u32 temp;
+};
+
struct tegra_soctherm_fuse {
u32 fuse_base_cp_mask, fuse_base_cp_shift;
u32 fuse_base_ft_mask, fuse_base_ft_shift;
@@ -113,6 +118,7 @@ struct tegra_soctherm_soc {
const int thresh_grain;
const unsigned int bptt;
const bool use_ccroc;
+ struct tsensor_group_thermtrips *thermtrips;
};
int tegra_calc_shared_calib(const struct tegra_soctherm_fuse *tfuse,
diff --git a/drivers/thermal/tegra/tegra210-soctherm.c b/drivers/thermal/tegra/tegra210-soctherm.c
index ad53169a8e95..0a0c3cec7134 100644
--- a/drivers/thermal/tegra/tegra210-soctherm.c
+++ b/drivers/thermal/tegra/tegra210-soctherm.c
@@ -203,6 +203,13 @@ static const struct tegra_soctherm_fuse tegra210_soctherm_fuse = {
.fuse_spare_realignment = 0,
};
+struct tsensor_group_thermtrips tegra210_tsensor_thermtrips[] = {
+ {.id = TEGRA124_SOCTHERM_SENSOR_NUM},
+ {.id = TEGRA124_SOCTHERM_SENSOR_NUM},
+ {.id = TEGRA124_SOCTHERM_SENSOR_NUM},
+ {.id = TEGRA124_SOCTHERM_SENSOR_NUM},
+};
+
const struct tegra_soctherm_soc tegra210_soctherm = {
.tsensors = tegra210_tsensors,
.num_tsensors = ARRAY_SIZE(tegra210_tsensors),
@@ -212,4 +219,5 @@ const struct tegra_soctherm_soc tegra210_soctherm = {
.thresh_grain = TEGRA210_THRESH_GRAIN,
.bptt = TEGRA210_BPTT,
.use_ccroc = false,
+ .thermtrips = tegra210_tsensor_thermtrips,
};
--
2.7.4
Set the "nvidia,thermtrips" property, which is used to set the
HW shutdown temperatures.
Signed-off-by: Wei Ni <[email protected]>
---
arch/arm64/boot/dts/nvidia/tegra210.dtsi | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index 8fe47d6445a5..f2e89b218b23 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -1330,6 +1330,9 @@
reset-names = "soctherm";
#thermal-sensor-cells = <1>;
+ nvidia,thermtrips = <TEGRA124_SOCTHERM_SENSOR_CPU 102500
+ TEGRA124_SOCTHERM_SENSOR_GPU 103000>;
+
throttle-cfgs {
throttle_heavy: heavy {
nvidia,priority = <100>;
@@ -1349,8 +1352,8 @@
<&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
trips {
- cpu-shutdown-trip {
- temperature = <102500>;
+ cpu-critical-trip {
+ temperature = <102000>;
hysteresis = <0>;
type = "critical";
};
@@ -1377,7 +1380,7 @@
<&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
trips {
- mem-shutdown-trip {
+ mem-critical-trip {
temperature = <103000>;
hysteresis = <0>;
type = "critical";
@@ -1399,8 +1402,8 @@
<&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
trips {
- gpu-shutdown-trip {
- temperature = <103000>;
+ gpu-critical-trip {
+ temperature = <102500>;
hysteresis = <0>;
type = "critical";
};
@@ -1427,7 +1430,7 @@
<&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
trips {
- pllx-shutdown-trip {
+ pllx-critical-trip {
temperature = <103000>;
hysteresis = <0>;
type = "critical";
--
2.7.4
On Fri, Dec 07, 2018 at 06:10:05PM +0800, Wei Ni wrote:
> Add optional property "nvidia,thermtrips".
> If present, these trips will be used as HW shutdown trips,
> and critical trips will be used as SW shutdown trips.
>
> Signed-off-by: Wei Ni <[email protected]>
> ---
> .../bindings/thermal/nvidia,tegra124-soctherm.txt | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
This seems like an odd exception. Why not extend the list of trip point
types with a "shutdown" or "emergency" type that can be used for this?
This doesn't seem like NVIDIA specific functionality, so adding an
NVIDIA specific property doesn't seem right.
Also, please always Cc [email protected] and the device tree
bindings maintainers when sending updates for a binding. They need to
ack these kinds of patches and they can't do that if they don't get a
copy of the patch.
Cc'ing them now and quoting the full patch for reference.
Thierry
>
> diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
> index b6c0ae53d4dc..ab66d6feab4b 100644
> --- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
> +++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
> @@ -55,10 +55,21 @@ Required properties :
> - #cooling-cells: Should be 1. This cooling device only support on/off state.
> See ./thermal.txt for a description of this property.
>
> +Optional properties:
> +- nvidia,thermtrips : When present, this property specifies the temperature at
> + which the soctherm hardware will assert the thermal trigger signal to the
> + Power Management IC, which can be configured to reset or shutdown the device.
> + It is an array of pairs where each pair represents a tsensor id followed by a
> + temperature in milli Celcius. In the absence of this property the critical
> + trip point will be used for thermtrip temperature.
> +
> Note:
> -- the "critical" type trip points will be set to SOC_THERM hardware as the
> -shut down temperature. Once the temperature of this thermal zone is higher
> -than it, the system will be shutdown or reset by hardware.
> +- the "critical" type trip points will be used to set the temperature at which
> +the SOC_THERM hardware will assert a thermal trigger if the "nvidia,thermtrips"
> +property is missing. When the thermtrips property is present, the breach of a
> +critical trip point is reported back to the thermal framework to implement
> +software shutdown.
> +
> - the "hot" type trip points will be set to SOC_THERM hardware as the throttle
> temperature. Once the the temperature of this thermal zone is higher
> than it, it will trigger the HW throttle event.
> @@ -79,6 +90,9 @@ Example :
>
> #thermal-sensor-cells = <1>;
>
> + nvidia,thermtrips = <TEGRA124_SOCTHERM_SENSOR_CPU 102500
> + TEGRA124_SOCTHERM_SENSOR_GPU 103000>;
> +
> throttle-cfgs {
> /*
> * When the "heavy" cooling device triggered,
> --
> 2.7.4
>
On 14/12/2018 10:29 PM, Thierry Reding wrote:
> On Fri, Dec 07, 2018 at 06:10:05PM +0800, Wei Ni wrote:
>> Add optional property "nvidia,thermtrips".
>> If present, these trips will be used as HW shutdown trips,
>> and critical trips will be used as SW shutdown trips.
>>
>> Signed-off-by: Wei Ni <[email protected]>
>> ---
>> .../bindings/thermal/nvidia,tegra124-soctherm.txt | 20 +++++++++++++++++---
>> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> This seems like an odd exception. Why not extend the list of trip point
> types with a "shutdown" or "emergency" type that can be used for this?
> This doesn't seem like NVIDIA specific functionality, so adding an
> NVIDIA specific property doesn't seem right.
The thermal framework only supports four trip types: "active", "passive",
"hot" and "critical". Normally, when a "critical" trip is triggered, the
thermal framework performs a software shutdown. In our soctherm
driver, we also program the "critical" trips into hardware, so crossing
one causes a HW shutdown directly.
This series adds "nvidia,thermtrips" to set the HW shutdown trips for
our NVIDIA-specific functionality, and keeps the "critical" trips for
SW shutdown.
For example, we set the "critical" trip to 102C and
"nvidia,thermtrips" to 103C. If the temperature reaches 102C, the
system performs a software shutdown; if the temperature rises quickly and
reaches 103C directly, the hardware shutdown is triggered.
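To make the selection rule concrete, here is a minimal userspace sketch
of it, not the driver code. The names thermtrip_entry, thermtrip_lookup,
hw_shutdown_temp, critical_is_sw_only and the NO_THERMTRIP sentinel are
all hypothetical; the patch itself uses min_low_temp as its "not found"
value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "no thermtrip given for this sensor". */
#define NO_THERMTRIP (-1)

/* Hypothetical model of one <id temperature> pair from the DT property. */
struct thermtrip_entry {
	int sensor_id;
	int temp_mc;	/* millidegrees Celsius */
};

/* Return the thermtrip temperature for sensor_id, or NO_THERMTRIP. */
static int thermtrip_lookup(const struct thermtrip_entry *tt, size_t n,
			    int sensor_id)
{
	for (size_t i = 0; i < n; i++) {
		if (tt[i].sensor_id == sensor_id)
			return tt[i].temp_mc;
	}

	return NO_THERMTRIP;
}

/*
 * Temperature to program as the HW shutdown trip: the thermtrip if
 * one exists for this sensor, otherwise the critical trip.
 */
static int hw_shutdown_temp(const struct thermtrip_entry *tt, size_t n,
			    int sensor_id, int critical_mc)
{
	int t = thermtrip_lookup(tt, n, sensor_id);

	return t != NO_THERMTRIP ? t : critical_mc;
}

/*
 * True when the critical trip is left to the thermal framework only
 * (SW shutdown), because a thermtrip already covers HW shutdown.
 */
static bool critical_is_sw_only(const struct thermtrip_entry *tt, size_t n,
				int sensor_id)
{
	return thermtrip_lookup(tt, n, sensor_id) != NO_THERMTRIP;
}
```

With a single entry { 0, 103000 } and a critical trip of 102000, sensor 0
gets 103000 programmed into HW and its critical trip stays SW-only, while
any sensor without an entry falls back to programming 102000 into HW.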
Thanks.
Wei.
>
> Also, please always Cc [email protected] and the device tree
> bindings maintainers when sending updates for a binding. They need to
> ack these kinds of patches and they can't do that if they don't get a
> copy of the patch.
>
> Cc'ing them now and quoting the full patch for reference.
>
> Thierry
>
>>
>> diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
>> index b6c0ae53d4dc..ab66d6feab4b 100644
>> --- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
>> +++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
>> @@ -55,10 +55,21 @@ Required properties :
>> - #cooling-cells: Should be 1. This cooling device only support on/off state.
>> See ./thermal.txt for a description of this property.
>>
>> +Optional properties:
>> +- nvidia,thermtrips : When present, this property specifies the temperature at
>> + which the soctherm hardware will assert the thermal trigger signal to the
>> + Power Management IC, which can be configured to reset or shutdown the device.
>> + It is an array of pairs where each pair represents a tsensor id followed by a
>> + temperature in milli Celcius. In the absence of this property the critical
>> + trip point will be used for thermtrip temperature.
>> +
>> Note:
>> -- the "critical" type trip points will be set to SOC_THERM hardware as the
>> -shut down temperature. Once the temperature of this thermal zone is higher
>> -than it, the system will be shutdown or reset by hardware.
>> +- the "critical" type trip points will be used to set the temperature at which
>> +the SOC_THERM hardware will assert a thermal trigger if the "nvidia,thermtrips"
>> +property is missing. When the thermtrips property is present, the breach of a
>> +critical trip point is reported back to the thermal framework to implement
>> +software shutdown.
>> +
>> - the "hot" type trip points will be set to SOC_THERM hardware as the throttle
>> temperature. Once the the temperature of this thermal zone is higher
>> than it, it will trigger the HW throttle event.
>> @@ -79,6 +90,9 @@ Example :
>>
>> #thermal-sensor-cells = <1>;
>>
>> + nvidia,thermtrips = <TEGRA124_SOCTHERM_SENSOR_CPU 102500
>> + TEGRA124_SOCTHERM_SENSOR_GPU 103000>;
>> +
>> throttle-cfgs {
>> /*
>> * When the "heavy" cooling device triggered,
>> --
>> 2.7.4