2013-08-23 23:17:25

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 00/14] RFCv2: device thermal limits represented in device tree nodes

Hello all,

First of all, apologize for the very late answer. But I am resuming this work.

This RFC is about describing hardware thermal limits using device tree.

Based on the discussion on the first RFC:
http://lkml.org/lkml/2013/7/22/319

I have changed the code and the device tree bindings as follows:
1. Now bindings may have several thermal zones per device (req. by Wei Ni)
2. The constants used are now using C preprocessor macros (as per Stephen suggestion)
3. Improved the bindings to have a better representation of links between
involved entities (thermal zone, cooling devices, sensors), as suggested
by Pawel and Mark.
4. Couple of changes in code, as per previous reviews.
5. Removed policies from DT bindings.

In summary, based on the discussion of the first RFC, now the idea is
to have thermal zones nodes describing the thermal zones
and the hardware limits. Then sensor devices nodes will point to
thermal zones via phandles and cooling device will also point to
thermal zone via phandles.

So, I am also posting the patch series to dt mailing list as requested.

Please refer to patch 02/14 to a documentation of what I am proposing.

This patch series includes:
a. Required changes in cpufreq-cpu0 driver so that it loads cooling device
when requested.
b. The thermal zone builder based on thermal bindings
c. Changes in sensor drivers (as examples)
d. Changes in device tree DTS files (as examples)

This patch series depends on the fixes I sent to linux-pm:
http://lkml.org/lkml/2013/8/23/572

In the above link you will find also the request to have a flag to
determine if virtual hwmon device will be created, based on thermal device.

All best,

Eduardo Valentin (14):
cpufreq: cpufreq-cpu0: add dt node parsing for 'cooling-zones'
drivers: thermal: introduce device tree parser
hwmon: lm75: expose to thermal fw via DT nodes
hwmon: tmp102: expose to thermal fw via DT nodes
thermal: ti-soc-thermal: use thermal DT infrastructure
arm: dts: add omap4 CPU thermal data
arm: dts: add omap4430 thermal data
arm: dts: add omap4460 thermal data
arm: dts: point to cooling-zones on omap4430 cpu node
arm: dts: point to cooling-zones on omap4460 cpu node
arm: dts: add omap5 GPU thermal data
arm: dts: add omap5 CORE thermal data
arm: dts: add omap5 thermal data
arm: dts: point to cooling-zones on omap5 cpu node

.../devicetree/bindings/cpufreq/cpufreq-cpu0.txt | 4 +
.../devicetree/bindings/thermal/thermal.txt | 160 +++++++
arch/arm/boot/dts/omap4-cpu-thermal.dtsi | 40 ++
arch/arm/boot/dts/omap443x.dtsi | 6 +
arch/arm/boot/dts/omap4460.dtsi | 6 +
arch/arm/boot/dts/omap5-core-thermal.dtsi | 27 ++
arch/arm/boot/dts/omap5-gpu-thermal.dtsi | 27 ++
arch/arm/boot/dts/omap5.dtsi | 10 +
drivers/cpufreq/cpufreq-cpu0.c | 12 +
drivers/hwmon/lm75.c | 31 ++
drivers/hwmon/tmp102.c | 26 ++
drivers/thermal/Kconfig | 13 +
drivers/thermal/Makefile | 1 +
drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++
drivers/thermal/ti-soc-thermal/ti-thermal-common.c | 52 ++-
include/dt-bindings/thermal/thermal.h | 27 ++
include/linux/thermal.h | 3 +
17 files changed, 907 insertions(+), 11 deletions(-)
create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt
create mode 100644 arch/arm/boot/dts/omap4-cpu-thermal.dtsi
create mode 100644 arch/arm/boot/dts/omap5-core-thermal.dtsi
create mode 100644 arch/arm/boot/dts/omap5-gpu-thermal.dtsi
create mode 100644 drivers/thermal/thermal_dt.c
create mode 100644 include/dt-bindings/thermal/thermal.h

--
1.8.2.1.342.gfa7285d


2013-08-23 23:17:34

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 01/14] cpufreq: cpufreq-cpu0: add dt node parsing for 'cooling-zones'

This patch changes the cpufreq-cpu0 driver to consider if
a cpu needs cooling (with cpufreq). In case the cooling is needed,
it can be flagged at the cpu0 device tree node, with the list
of zones property 'cooling-zones'.

In case this list of zones is present, the driver will
load a cpufreq cooling device in the system. The cpufreq-cpu0
driver is not interested in determining how the system should
be using the cooling device. The driver is responsible
only of loading the cooling device.

Describing how the cooling device will be used can be
accomplished by setting up a thermal zone that references
and is composed by the cpufreq cooling device.

Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: Grant Likely <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt | 4 ++++
drivers/cpufreq/cpufreq-cpu0.c | 12 ++++++++++++
2 files changed, 16 insertions(+)

diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
index 051f764..add50f7 100644
--- a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
+++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
@@ -15,6 +15,9 @@ Optional properties:
- clock-latency: Specify the possible maximum transition latency for clock,
in unit of nanoseconds.
- voltage-tolerance: Specify the CPU voltage tolerance in percentage.
+- cooling-zones: A list of thermal zones phandles. The generic cpu
+ cooling (freq clipping) is loaded by the generic cpufreq-cpu0 driver
+ in case the device tree node has this list.

Examples:

@@ -33,6 +36,7 @@ cpus {
198000 850000
>;
clock-latency = <61036>; /* two CLK32 periods */
+ cooling-zones = <&cpu_thermal>;
};

cpu@1 {
diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
index ad1fde2..ede6487 100644
--- a/drivers/cpufreq/cpufreq-cpu0.c
+++ b/drivers/cpufreq/cpufreq-cpu0.c
@@ -20,6 +20,9 @@
#include <linux/platform_device.h>
#include <linux/regulator/consumer.h>
#include <linux/slab.h>
+#include <linux/thermal.h>
+#include <linux/cpu_cooling.h>
+#include <linux/cpumask.h>

static unsigned int transition_latency;
static unsigned int voltage_tolerance; /* in percentage */
@@ -28,6 +31,7 @@ static struct device *cpu_dev;
static struct clk *cpu_clk;
static struct regulator *cpu_reg;
static struct cpufreq_frequency_table *freq_table;
+static struct thermal_cooling_device *cdev;

static int cpu0_verify_speed(struct cpufreq_policy *policy)
{
@@ -268,6 +272,13 @@ static int cpu0_cpufreq_probe(struct platform_device *pdev)
goto out_free_table;
}

+ /*
+ * For now, just loading the cooling device;
+ * thermal DT code takes care of matching them.
+ */
+ if (of_find_property(np, "cooling-zones", NULL))
+ cdev = cpufreq_cooling_register(cpu_present_mask);
+
of_node_put(np);
of_node_put(parent);
return 0;
@@ -283,6 +294,7 @@ out_put_parent:

static int cpu0_cpufreq_remove(struct platform_device *pdev)
{
+ cpufreq_cooling_unregister(cdev);
cpufreq_unregister_driver(&cpu0_cpufreq_driver);
opp_free_cpufreq_table(cpu_dev, &freq_table);

--
1.8.2.1.342.gfa7285d

2013-08-23 23:17:44

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

In order to be able to build thermal policies
based on generic sensors, like I2C device, that
can be places in different points on different boards,
there is a need to have a way to feed board dependent
data into the thermal framework.

This patch introduces a thermal data parser for device
tree. The parsed data is used to build thermal zones
and thermal binding parameters. The output data
can then be used to deploy thermal policies.

This patch adds also documentation regarding this
API and how to define tree nodes to use
this infrastructure.

Cc: Zhang Rui <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
.../devicetree/bindings/thermal/thermal.txt | 160 +++++++
drivers/thermal/Kconfig | 13 +
drivers/thermal/Makefile | 1 +
drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++
include/dt-bindings/thermal/thermal.h | 27 ++
include/linux/thermal.h | 3 +
6 files changed, 677 insertions(+)
create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt
create mode 100644 drivers/thermal/thermal_dt.c
create mode 100644 include/dt-bindings/thermal/thermal.h

diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
new file mode 100644
index 0000000..af20ab0
--- /dev/null
+++ b/Documentation/devicetree/bindings/thermal/thermal.txt
@@ -0,0 +1,160 @@
+----------------------------------------
+Thermal Framework Device Tree descriptor
+----------------------------------------
+
+This file describes how to define a thermal structure using device tree.
+A thermal structure includes thermal zones and their components, such
+as name, trip points, polling intervals and cooling devices binding
+descriptors. A binding descriptor may contain information on how to
+react, with a cooling stepped action or a weight on a fair share.
+The target of device tree thermal descriptors is to describe only
+the hardware thermal aspects, not how the system must control or which
+algorithm or policy must take place. It follows a description of each
+type of device tree node.
+
+****
+trip
+****
+
+The trip node is a node to describe a point in the temperature domain
+in which the system takes an action. This node describes just the point,
+not the action.
+
+A node describing a trip must contain:
+- temperature: the trip temperature level, in milliCelsius.
+- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative
+value, in milliCelsius.
+- type: the trip type. Here is the type mapping:
+ THERMAL_TRIP_ACTIVE
+ THERMAL_TRIP_PASSIVE
+ THERMAL_TRIP_HOT
+ THERMAL_TRIP_CRITICAL
+
+**********
+bind_param
+**********
+
+The bind_param node is a node to describe how cooling devices get assigned
+to trip points of the zone. The cooling devices are expected to be loaded
+in the target system.
+
+A node describing a bind_param must contain:
+- cooling_device: A string with the cooling device name.
+- weight: The 'influence' of a particular cooling device on this zone.
+ This is on a percentage scale. The sum of all these weights
+ (for a particular zone) cannot exceed 100.
+- trip_mask: This is a bit mask that gives the binding relation between
+ this thermal zone and cdev, for a particular trip point.
+ If nth bit is set, then the cdev and thermal zone are bound
+ for trip point n.
+- limits: An array integers consisting of 2-tuples items, and each item means
+ lower and upper state limits, like <min-state max-state>.
+
+**************************
+temperature sensor devices
+**************************
+
+Temperature sensor devices have to be linked to a specific thermal zone.
+This is done by means of the 'monitored-zones' list.
+- monitored-zones: A list of thermal zones phandles, identifying which
+thermal zones the temperature sensor device is used to monitor.
+
+************
+thermal_zone
+************
+
+The thermal_zone node is the node containing all the required info
+for describing a thermal zone, including its cdev bindings. The thermal_zone
+node must contain, apart from its own properties, one node containing
+trip nodes and one node containing all the zone bind parameters.
+
+Required properties:
+- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable.
+
+- passive_delay: number of milliseconds to wait between polls when
+ performing passive cooling.
+
+- polling_delay: number of milliseconds to wait between polls when checking.
+
+- #thermal-cells: An integer that is used to specify how many cells compose
+ sensor ids.
+
+Below is an example:
+cpu_thermal: thermal_zone {
+ #thermal-cells = <1>;
+ mask = <0x03>; /* trips writability */
+ passive_delay = <250>; /* milliseconds */
+ polling_delay = <1000>; /* milliseconds */
+ trips {
+ alert@100000{
+ temperature = <100000>; /* milliCelsius */
+ hysteresis = <0>; /* milliCelsius */
+ type = <THERMAL_TRIP_PASSIVE>;
+ };
+ crit@125000{
+ temperature = <125000>; /* milliCelsius */
+ hysteresis = <0>; /* milliCelsius */
+ type = <THERMAL_TRIP_CRITICAL>;
+ };
+ };
+ bind_params {
+ action@0{
+ cooling_device = "thermal-cpufreq";
+ weight = <100>; /* percentage */
+ mask = <0x01>;
+ limits = <
+ /* lower-state upper-state */
+ 1 THERMAL_NO_LIMITS
+ THERMAL_NO_LIMITS THERMAL_NO_LIMITS
+ >;
+ };
+ };
+};
+
+sensor: adc@0xFFFF {
+ ...
+ monitored-zones = <&cpu_thermal 0>;
+};
+
+cpu@0 {
+ ...
+ cooling-zones = <&cpu_thermal>;
+};
+
+In the example above, the ADC sensor at address 0xFFFF is used to monitor
+the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked
+to the same thermal zone 'cpu_thermal' as a cooling device.
+
+***************
+cpufreq-cooling
+***************
+
+The generic cpu cooling (freq clipping) can be loaded by the
+generic cpufreq-cpu0 driver in case the device tree node informs
+this need.
+
+In order to load the cpufreq cooling device, it is possible to
+inform that the cpu needs cooling by adding the list of thermal zones
+in the 'cooling-zones' property at the cpu0 phandle.
+
+Example:
+ cpus {
+ cpu@0 {
+ operating-points = <
+ /* kHz uV */
+ 800000 1313000
+ 1008000 1375000
+ >;
+ cooling-zones = <&cpu_thermal>;
+ };
+ ...
+ };
+
+The above will cause the cpufreq-cpu0 driver to understand that
+the cpu will need passive cooling and while probing that node it will
+also load the cpufreq cpu cooling device in that system.
+
+However, be advised that this flag will not describe what needs to
+be done or how the cpufreq cooling device will be used. In order to
+accomplish this, one needs to write a phandle for a thermal zone, as
+described in the section 'thermal_zone'.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 7fb16bc..753f0dc 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -29,6 +29,19 @@ config THERMAL_HWMON
Say 'Y' here if you want all thermal sensors to
have hwmon sysfs interface too.

+config THERMAL_OF
+ bool
+ prompt "APIs to parse thermal data out of device tree"
+ depends on OF
+ default y
+ help
+ This options provides helpers to add the support to
+ read and parse thermal data definitions out of the
+ device tree blob.
+
+ Say 'Y' here if you need to build thermal infrastructure
+ based on device tree.
+
choice
prompt "Default Thermal governor"
default THERMAL_DEFAULT_GOV_STEP_WISE
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 24cb894..eedb273 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o

# interface to/from other layers providing sensors
thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o
+thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o

# governors
thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o
diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c
new file mode 100644
index 0000000..cc35485
--- /dev/null
+++ b/drivers/thermal/thermal_dt.c
@@ -0,0 +1,473 @@
+/*
+ * thermal_dt.c - Generic Thermal Management device tree support.
+ *
+ * Copyright (C) 2013 Texas Instruments
+ * Copyright (C) 2013 Eduardo Valentin <[email protected]>
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ */
+#include <linux/thermal.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/of_device.h>
+#include <linux/of_platform.h>
+#include <linux/err.h>
+#include <linux/export.h>
+#include <linux/string.h>
+
+struct __thermal_bind_params {
+ char cooling_device[THERMAL_NAME_LENGTH];
+};
+
+static
+int thermal_of_match(struct thermal_zone_device *tz,
+ struct thermal_cooling_device *cdev);
+
+static int thermal_of_populate_bind_params(struct device *dev,
+ struct device_node *node,
+ struct __thermal_bind_params *__tbp,
+ struct thermal_bind_params *tbp,
+ int ntrips)
+{
+ const char *cdev_name;
+ u32 *limits;
+ int ret;
+ u32 prop;
+
+ ret = of_property_read_u32(node, "weight", &prop);
+ if (ret < 0) {
+ dev_err(dev, "missing weight property\n");
+ return ret;
+ }
+ tbp->weight = prop;
+
+ ret = of_property_read_u32(node, "mask", &prop);
+ if (ret < 0) {
+ dev_err(dev, "missing mask property\n");
+ return ret;
+ }
+ tbp->trip_mask = prop;
+
+ /* this will allow us to bind with cooling devices */
+ tbp->match = thermal_of_match;
+
+ ret = of_property_read_string(node, "cooling_device", &cdev_name);
+ if (ret < 0) {
+ dev_err(dev, "missing cooling_device property\n");
+ return ret;
+ }
+ strncpy(__tbp->cooling_device, cdev_name,
+ sizeof(__tbp->cooling_device));
+
+ limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL);
+ if (!limits) {
+ dev_err(dev, "not enough mem for reading limits\n");
+ return -ENOMEM;
+ }
+ ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips);
+ if (!ret) {
+ int i = ntrips;
+
+ tbp->binding_limits = devm_kzalloc(dev,
+ ntrips * 2 *
+ sizeof(*tbp->binding_limits),
+ GFP_KERNEL);
+ if (!tbp->binding_limits) {
+ dev_err(dev, "not enough mem for storing limits\n");
+ return -ENOMEM;
+ }
+ while (i--)
+ tbp->binding_limits[i] = limits[i];
+ }
+ kfree(limits);
+
+ return 0;
+}
+
+struct __thermal_trip {
+ unsigned long int temperature;
+ unsigned long int hysteresis;
+ enum thermal_trip_type type;
+};
+
+static
+int thermal_of_populate_trip(struct device *dev,
+ struct device_node *node,
+ struct __thermal_trip *trip)
+{
+ int ret;
+ int prop;
+
+ ret = of_property_read_u32(node, "temperature", &prop);
+ if (ret < 0) {
+ dev_err(dev, "missing temperature property\n");
+ return ret;
+ }
+ trip->temperature = prop;
+
+ ret = of_property_read_u32(node, "hysteresis", &prop);
+ if (ret < 0) {
+ dev_err(dev, "missing hysteresis property\n");
+ return ret;
+ }
+ trip->hysteresis = prop;
+
+ ret = of_property_read_u32(node, "type", &prop);
+ if (ret < 0) {
+ dev_err(dev, "missing type property\n");
+ return ret;
+ }
+ trip->type = prop;
+
+ return 0;
+}
+
+struct __thermal_zone_device {
+ enum thermal_device_mode mode;
+ int passive_delay;
+ int polling_delay;
+ int mask;
+ int ntrips;
+ char type[THERMAL_NAME_LENGTH];
+ struct __thermal_trip *trips;
+ struct __thermal_bind_params *bind_params;
+ struct thermal_bind_params *tbps;
+ struct thermal_zone_params zone_params;
+ int (*get_temp)(void *, unsigned long *);
+ void *devdata;
+};
+
+static struct __thermal_zone_device *
+thermal_of_build_thermal_zone(struct device *dev, struct device_node *node)
+{
+ struct device_node *child, *gchild;
+ struct __thermal_zone_device *tz;
+ int ret, i;
+ u32 prop;
+
+ if (!node) {
+ dev_err(dev, "no thermal zone node\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL);
+ if (!tz) {
+ dev_err(dev, "not enough memory for thermal of zone\n");
+ return ERR_PTR(-ENOMEM);
+ }
+
+ ret = of_property_read_u32(node, "passive_delay", &prop);
+ if (ret < 0) {
+ dev_err(dev, "missing passive_delay property\n");
+ return ERR_PTR(ret);
+ }
+ tz->passive_delay = prop;
+
+ ret = of_property_read_u32(node, "polling_delay", &prop);
+ if (ret < 0) {
+ dev_err(dev, "missing polling_delay property\n");
+ return ERR_PTR(ret);
+ }
+ tz->polling_delay = prop;
+
+ ret = of_property_read_u32(node, "mask", &prop);
+ if (ret < 0) {
+ dev_err(dev, "missing mask property\n");
+ return ERR_PTR(ret);
+ }
+ tz->mask = prop;
+
+ /* assume node name as our zone type */
+ strncpy(tz->type, node->name, sizeof(tz->type));
+
+ /* default policy */
+ strncpy(tz->zone_params.governor_name, "step_wise",
+ sizeof(tz->zone_params.governor_name));
+
+ /* trips */
+ child = of_find_node_by_name(node, "trips");
+
+ /* No trips provided */
+ if (!child)
+ goto finish;
+
+ tz->ntrips = of_get_child_count(child);
+ tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips),
+ GFP_KERNEL);
+ if (!tz->trips)
+ return ERR_PTR(-ENOMEM);
+ i = 0;
+ for_each_child_of_node(child, gchild)
+ thermal_of_populate_trip(dev, gchild, &tz->trips[i++]);
+
+ /* bind_params */
+ child = of_find_node_by_name(node, "bind_params");
+
+ /* No binding info */
+ if (!child)
+ goto finish;
+
+ tz->zone_params.num_tbps = of_get_child_count(child);
+ tz->bind_params = devm_kzalloc(dev,
+ tz->zone_params.num_tbps *
+ sizeof(*tz->bind_params),
+ GFP_KERNEL);
+ if (!tz->bind_params)
+ return ERR_PTR(-ENOMEM);
+ tz->zone_params.tbp = devm_kzalloc(dev,
+ tz->zone_params.num_tbps *
+ sizeof(*tz->zone_params.tbp),
+ GFP_KERNEL);
+ if (!tz->zone_params.tbp)
+ return ERR_PTR(-ENOMEM);
+ i = 0;
+ for_each_child_of_node(child, gchild) {
+ thermal_of_populate_bind_params(dev, gchild,
+ &tz->bind_params[i],
+ &tz->zone_params.tbp[i],
+ tz->ntrips);
+ i++;
+ }
+
+finish:
+ tz->mode = THERMAL_DEVICE_ENABLED;
+
+ return tz;
+}
+
+static
+int thermal_of_match(struct thermal_zone_device *tz,
+ struct thermal_cooling_device *cdev)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+ int i;
+
+ for (i = 0; i < data->zone_params.num_tbps; i++) {
+ if (!strncmp(data->bind_params[i].cooling_device,
+ cdev->type,
+ strlen(data->bind_params[i].cooling_device)) &&
+ (data->zone_params.tbp[i].trip_mask & (1 << i)))
+ return 0;
+ }
+
+ return -EINVAL;
+}
+
+static
+int of_thermal_get_temp(struct thermal_zone_device *tz,
+ unsigned long *temp)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+
+ return data->get_temp(data->devdata, temp);
+}
+
+static
+int of_thermal_get_mode(struct thermal_zone_device *tz,
+ enum thermal_device_mode *mode)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+
+ *mode = data->mode;
+
+ return 0;
+}
+
+static
+int of_thermal_set_mode(struct thermal_zone_device *tz,
+ enum thermal_device_mode mode)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+
+ mutex_lock(&tz->lock);
+
+ if (mode == THERMAL_DEVICE_ENABLED)
+ tz->polling_delay = data->polling_delay;
+ else
+ tz->polling_delay = 0;
+
+ mutex_unlock(&tz->lock);
+
+ data->mode = mode;
+ thermal_zone_device_update(tz);
+
+ return 0;
+}
+
+static
+int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip,
+ enum thermal_trip_type *type)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+
+ if (trip >= data->ntrips || trip < 0)
+ return -EDOM;
+
+ *type = data->trips[trip].type;
+
+ return 0;
+}
+
+static
+int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip,
+ unsigned long *temp)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+
+ if (trip >= data->ntrips || trip < 0)
+ return -EDOM;
+
+ *temp = data->trips[trip].temperature;
+
+ return 0;
+}
+
+static
+int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip,
+ unsigned long temp)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+
+ if (trip >= data->ntrips || trip < 0)
+ return -EDOM;
+
+ /* thermal fw should take care of data->mask & (1 << trip) */
+ data->trips[trip].temperature = temp;
+
+ return 0;
+}
+
+static
+int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip,
+ unsigned long *hyst)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+
+ if (trip >= data->ntrips || trip < 0)
+ return -EDOM;
+
+ *hyst = data->trips[trip].hysteresis;
+
+ return 0;
+}
+
+static
+int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip,
+ unsigned long hyst)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+
+ if (trip >= data->ntrips || trip < 0)
+ return -EDOM;
+
+ /* thermal fw should take care of data->mask & (1 << trip) */
+ data->trips[trip].hysteresis = hyst;
+
+ return 0;
+}
+
+static int
+of_thermal_get_crit_temp(struct thermal_zone_device *tz, unsigned long *temp)
+{
+ struct __thermal_zone_device *data = tz->devdata;
+ int i;
+
+ for (i = 0; i < data->ntrips; i++)
+ if (data->trips[i].type == THERMAL_TRIP_CRITICAL) {
+ *temp = data->trips[i].temperature;
+ return 0;
+ }
+
+ return -EINVAL;
+}
+
+static const
+struct thermal_zone_device_ops of_thermal_ops = {
+ .get_temp = of_thermal_get_temp,
+ .get_mode = of_thermal_get_mode,
+ .set_mode = of_thermal_set_mode,
+ .get_trip_type = of_thermal_get_trip_type,
+ .get_trip_temp = of_thermal_get_trip_temp,
+ .set_trip_temp = of_thermal_set_trip_temp,
+ .get_trip_hyst = of_thermal_get_trip_hyst,
+ .set_trip_hyst = of_thermal_set_trip_hyst,
+ .get_crit_temp = of_thermal_get_crit_temp,
+};
+
+/**
+ * thermal_zone_of_device_register() - registers a thermal zone based on DT
+ * @device: a valid struct device pointer containing device tree node
+ * with thermal_zone descriptor.
+ * @data: a private pointer (owned by the caller) that will be passed
+ * back, when a temperature reading is needed.
+ * @get_temp: a pointer to a function that returns the thermal zone
+ * temperature.
+ * @add_hwmon: a boolean to indicate if the thermal to hwmon sysfs interface
+ * is required. When add_hwmon == true, a hwmon sysfs interface
+ * will be created. When add_hwmon == false, nothing will be done
+ *
+ * This function will parse the device tree node associated with the
+ * @dev struct and will attempt to build a thermal zone device based
+ * upon the thermal data provided and described by DT.
+ *
+ * The thermal zone temperature is provided by the @get_temp function
+ * pointer. When called, it will have the private pointer @data back.
+ *
+ * Return: On success returns a valid struct thermal_zone_device,
+ * otherwise, it returns a corresponding ERR_PTR(). Caller must
+ * check the return value with help of IS_ERR() helper.
+ */
+struct thermal_zone_device *
+thermal_zone_of_device_register(struct device *dev, int id, bool add_hwmon,
+ void *data, int (*get_temp)(void *, unsigned long *))
+{
+ struct __thermal_zone_device *tz;
+ struct thermal_zone_device_ops *ops;
+ struct of_phandle_args thermal_spec;
+ int ret;
+
+ ret = of_parse_phandle_with_args(dev->of_node,
+ "monitored-zones",
+ "#thermal-cells",
+ id,
+ &thermal_spec);
+ if (ret) {
+ dev_err(dev, "unable to find desired monitored zone %d\n", id);
+ return ERR_PTR(ret);
+ }
+
+ tz = thermal_of_build_thermal_zone(dev, thermal_spec.np);
+ if (IS_ERR(tz)) {
+ dev_err(dev, "failed to build thermal zone\n");
+ return ERR_CAST(tz);
+ }
+ ops = devm_kzalloc(dev, sizeof(*ops), GFP_KERNEL);
+ if (!ops) {
+ dev_err(dev, "no memory available for thermal ops\n");
+ return ERR_PTR(-ENOMEM);
+ }
+ memcpy(ops, &of_thermal_ops, sizeof(*ops));
+ tz->get_temp = get_temp;
+ tz->devdata = data;
+
+ return thermal_zone_device_register(tz->type, tz->ntrips, tz->mask, tz,
+ ops, &tz->zone_params,
+ tz->passive_delay,
+ tz->polling_delay,
+ add_hwmon);
+}
+EXPORT_SYMBOL_GPL(thermal_zone_of_device_register);
diff --git a/include/dt-bindings/thermal/thermal.h b/include/dt-bindings/thermal/thermal.h
new file mode 100644
index 0000000..6dd6ccd
--- /dev/null
+++ b/include/dt-bindings/thermal/thermal.h
@@ -0,0 +1,27 @@
+/*
+ * This header provides constants for most thermal bindings.
+ *
+ * Copyright (C) 2013 Texas Instruments
+ * Eduardo Valentin <[email protected]>
+ *
+ * GPLv2 only
+ */
+
+#ifndef _DT_BINDINGS_THERMAL_THERMAL_H
+#define _DT_BINDINGS_THERMAL_THERMAL_H
+
+/*
+ * Here are the thermal trip types. This must
+ * match with enum thermal_trip_type at
+ * include/linux/thermal.h
+ */
+#define THERMAL_TRIP_ACTIVE 0
+#define THERMAL_TRIP_PASSIVE 1
+#define THERMAL_TRIP_HOT 2
+#define THERMAL_TRIP_CRITICAL 3
+
+/* On cooling devices upper and lower limits */
+#define THERMAL_NO_LIMIT (-1UL)
+
+#endif
+
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 7c2769c..501b2fe 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -234,6 +234,9 @@ struct thermal_genl_event {
};

/* Function declarations */
+struct thermal_zone_device
+*thermal_zone_of_device_register(struct device *, int id, bool add_hwmon,
+ void *data, int (*get_temp) (void *, unsigned long *));
struct thermal_zone_device *thermal_zone_device_register(const char *, int, int,
void *, const struct thermal_zone_device_ops *,
const struct thermal_zone_params *, int, int, bool);
--
1.8.2.1.342.gfa7285d

2013-08-23 23:17:50

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 03/14] hwmon: lm75: expose to thermal fw via DT nodes

This patch adds to lm75 temperature sensor the possibility
to expose itself as thermal zone device, registered on the
thermal framework.

The thermal zone is built only if a device tree node
describing a thermal zone for this sensor is present
inside the lm75 DT node. Otherwise, the driver behavior
will be the same.

Cc: Jean Delvare <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
drivers/hwmon/lm75.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/drivers/hwmon/lm75.c b/drivers/hwmon/lm75.c
index c03b490..dc55908 100644
--- a/drivers/hwmon/lm75.c
+++ b/drivers/hwmon/lm75.c
@@ -27,6 +27,8 @@
#include <linux/hwmon-sysfs.h>
#include <linux/err.h>
#include <linux/mutex.h>
+#include <linux/thermal.h>
+#include <linux/of.h>
#include "lm75.h"


@@ -70,6 +72,7 @@ static const u8 LM75_REG_TEMP[3] = {
/* Each client has this additional data */
struct lm75_data {
struct device *hwmon_dev;
+ struct thermal_zone_device *tz;
struct mutex update_lock;
u8 orig_conf;
u8 resolution; /* In bits, between 9 and 12 */
@@ -92,6 +95,19 @@ static struct lm75_data *lm75_update_device(struct device *dev);

/* sysfs attributes for hwmon */

+static int lm75_read_temp(void *dev, unsigned long *temp)
+{
+ struct lm75_data *data = lm75_update_device(dev);
+
+ if (IS_ERR(data))
+ return PTR_ERR(data);
+
+ *temp = ((data->temp[0] >> (16 - data->resolution)) * 1000) >>
+ (data->resolution - 8);
+
+ return 0;
+}
+
static ssize_t show_temp(struct device *dev, struct device_attribute *da,
char *buf)
{
@@ -271,11 +287,25 @@ lm75_probe(struct i2c_client *client, const struct i2c_device_id *id)
goto exit_remove;
}

+ if (of_find_property(client->dev.of_node, "monitored-zones", NULL)) {
+ data->tz = thermal_zone_of_device_register(&client->dev,
+ 0,
+ false, /* -hwmon */
+ &client->dev,
+ lm75_read_temp);
+ if (IS_ERR(data->tz)) {
+ status = PTR_ERR(data->tz);
+ goto exit_hwmon;
+ }
+ }
+
dev_info(&client->dev, "%s: sensor '%s'\n",
dev_name(data->hwmon_dev), client->name);

return 0;

+exit_hwmon:
+ hwmon_device_unregister(data->hwmon_dev);
exit_remove:
sysfs_remove_group(&client->dev.kobj, &lm75_group);
return status;
@@ -285,6 +315,7 @@ static int lm75_remove(struct i2c_client *client)
{
struct lm75_data *data = i2c_get_clientdata(client);

+ thermal_zone_device_unregister(data->tz);
hwmon_device_unregister(data->hwmon_dev);
sysfs_remove_group(&client->dev.kobj, &lm75_group);
lm75_write_value(client, LM75_REG_CONF, data->orig_conf);
--
1.8.2.1.342.gfa7285d

2013-08-23 23:18:06

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 05/14] thermal: ti-soc-thermal: use thermal DT infrastructure

This patch improves the ti-soc-thermal driver by adding the
support to build the thermal zones based on DT nodes.

The driver will have two options now to build the thermal
zones. The first option is the zones originally coded
in this driver. So, the driver behavior will be same
if there is no DT node describing the zones. The second
option, when it is found a DT node with thermal data,
will used the common infrastructure to build the thermal
zone and bind its cooling devices.

In case the driver loads thermal data using the legacy
mode, this driver still adds to the system
a cpufreq cooling device. Loading the thermal data from
DT, the driver assumes someone else will add the cpufreq
cooling device, like the cpufreq driver.

Cc: Zhang Rui <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
drivers/thermal/ti-soc-thermal/ti-thermal-common.c | 52 +++++++++++++++++-----
1 file changed, 41 insertions(+), 11 deletions(-)

diff --git a/drivers/thermal/ti-soc-thermal/ti-thermal-common.c b/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
index 5ab613a..6aac72c 100644
--- a/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
+++ b/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
@@ -31,6 +31,7 @@
#include <linux/cpufreq.h>
#include <linux/cpumask.h>
#include <linux/cpu_cooling.h>
+#include <linux/of.h>

#include "ti-thermal.h"
#include "ti-bandgap.h"
@@ -75,11 +76,10 @@ static inline int ti_thermal_hotspot_temperature(int t, int s, int c)

/* thermal zone ops */
/* Get temperature callback function for thermal zone*/
-static inline int ti_thermal_get_temp(struct thermal_zone_device *thermal,
- unsigned long *temp)
+static inline int __ti_thermal_get_temp(void *devdata, unsigned long *temp)
{
struct thermal_zone_device *pcb_tz = NULL;
- struct ti_thermal_data *data = thermal->devdata;
+ struct ti_thermal_data *data = devdata;
struct ti_bandgap *bgp;
const struct ti_temp_sensor *s;
int ret, tmp, slope, constant;
@@ -117,6 +117,14 @@ static inline int ti_thermal_get_temp(struct thermal_zone_device *thermal,
return ret;
}

+static inline int ti_thermal_get_temp(struct thermal_zone_device *thermal,
+ unsigned long *temp)
+{
+ struct ti_thermal_data *data = thermal->devdata;
+
+ return __ti_thermal_get_temp(data, temp);
+}
+
/* Bind callback functions for thermal zone */
static int ti_thermal_bind(struct thermal_zone_device *thermal,
struct thermal_cooling_device *cdev)
@@ -293,6 +301,7 @@ int ti_thermal_expose_sensor(struct ti_bandgap *bgp, int id,
char *domain)
{
struct ti_thermal_data *data;
+ struct device_node *np;

data = ti_bandgap_get_sensor_data(bgp, id);

@@ -302,16 +311,23 @@ int ti_thermal_expose_sensor(struct ti_bandgap *bgp, int id,
if (!data)
return -EINVAL;

- /* Create thermal zone */
- data->ti_thermal = thermal_zone_device_register(domain,
+ /* in case this is specified by DT */
+ np = bgp->dev->of_node;
+ data->ti_thermal = thermal_zone_of_device_register(bgp->dev, id,
+ true, /* +hwmon */
+ data, __ti_thermal_get_temp);
+ if (IS_ERR(data->ti_thermal)) {
+ /* Create thermal zone */
+ data->ti_thermal = thermal_zone_device_register(domain,
OMAP_TRIP_NUMBER, 0, data, &ti_thermal_ops,
NULL, FAST_TEMP_MONITORING_RATE,
FAST_TEMP_MONITORING_RATE, true);
- if (IS_ERR(data->ti_thermal)) {
- dev_err(bgp->dev, "thermal zone device is NULL\n");
- return PTR_ERR(data->ti_thermal);
+ if (IS_ERR(data->ti_thermal)) {
+ dev_err(bgp->dev, "thermal zone device is NULL\n");
+ return PTR_ERR(data->ti_thermal);
+ }
+ data->ti_thermal->polling_delay = FAST_TEMP_MONITORING_RATE;
}
- data->ti_thermal->polling_delay = FAST_TEMP_MONITORING_RATE;
ti_bandgap_set_sensor_data(bgp, id, data);

return 0;
@@ -323,7 +339,8 @@ int ti_thermal_remove_sensor(struct ti_bandgap *bgp, int id)

data = ti_bandgap_get_sensor_data(bgp, id);

- thermal_zone_device_unregister(data->ti_thermal);
+ if (data && data->ti_thermal)
+ thermal_zone_device_unregister(data->ti_thermal);

return 0;
}
@@ -342,6 +359,17 @@ int ti_thermal_report_sensor_temperature(struct ti_bandgap *bgp, int id)
int ti_thermal_register_cpu_cooling(struct ti_bandgap *bgp, int id)
{
struct ti_thermal_data *data;
+ struct device_node *np = bgp->dev->of_node;
+
+ /*
+ * We are assuming here that if one deploys the zone
+ * using DT, then it must be aware that the cooling device
+ * loading has to happen via cpufreq driver.
+ */
+ if (of_find_property(np, "monitored-zones", NULL)) {
+ of_node_put(np);
+ return 0;
+ }

data = ti_bandgap_get_sensor_data(bgp, id);
if (!data || IS_ERR(data))
@@ -372,7 +400,9 @@ int ti_thermal_unregister_cpu_cooling(struct ti_bandgap *bgp, int id)
struct ti_thermal_data *data;

data = ti_bandgap_get_sensor_data(bgp, id);
- cpufreq_cooling_unregister(data->cool_dev);
+
+ if (data && data->cool_dev)
+ cpufreq_cooling_unregister(data->cool_dev);

return 0;
}
--
1.8.2.1.342.gfa7285d

2013-08-23 23:18:09

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 04/14] hwmon: tmp102: expose to thermal fw via DT nodes

This patch adds to tmp102 temperature sensor the possibility
to expose itself as thermal zone device, registered on the
thermal framework.

The thermal zone is built only if a device tree node
describing a thermal zone for this sensor is present
inside the tmp102 DT node. Otherwise, the driver behavior
will be the same.

Cc: Jean Delvare <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
drivers/hwmon/tmp102.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/drivers/hwmon/tmp102.c b/drivers/hwmon/tmp102.c
index d7b47ab..0f1e44a 100644
--- a/drivers/hwmon/tmp102.c
+++ b/drivers/hwmon/tmp102.c
@@ -27,6 +27,8 @@
#include <linux/mutex.h>
#include <linux/device.h>
#include <linux/jiffies.h>
+#include <linux/thermal.h>
+#include <linux/of.h>

#define DRIVER_NAME "tmp102"

@@ -50,6 +52,7 @@

struct tmp102 {
struct device *hwmon_dev;
+ struct thermal_zone_device *tz;
struct mutex lock;
u16 config_orig;
unsigned long last_update;
@@ -93,6 +96,15 @@ static struct tmp102 *tmp102_update_device(struct i2c_client *client)
return tmp102;
}

+static int tmp102_read_temp(void *dev, unsigned long *temp)
+{
+ struct tmp102 *tmp102 = tmp102_update_device(to_i2c_client(dev));
+
+ *temp = tmp102->temp[0];
+
+ return 0;
+}
+
static ssize_t tmp102_show_temp(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -204,10 +216,23 @@ static int tmp102_probe(struct i2c_client *client,
goto fail_remove_sysfs;
}

+ if (of_find_property(client->dev.of_node, "monitored-zones", NULL)) {
+ tmp102->tz = thermal_zone_of_device_register(&client->dev, 0,
+ false, /* -hwmon */
+ &client->dev,
+ tmp102_read_temp);
+ if (IS_ERR(tmp102->tz)) {
+ status = PTR_ERR(tmp102->tz);
+ goto exit_hwmon;
+ }
+ }
+
dev_info(&client->dev, "initialized\n");

return 0;

+exit_hwmon:
+ hwmon_device_unregister(tmp102->hwmon_dev);
fail_remove_sysfs:
sysfs_remove_group(&client->dev.kobj, &tmp102_attr_group);
fail_restore_config:
@@ -220,6 +245,7 @@ static int tmp102_remove(struct i2c_client *client)
{
struct tmp102 *tmp102 = i2c_get_clientdata(client);

+ thermal_zone_device_unregister(tmp102->tz);
hwmon_device_unregister(tmp102->hwmon_dev);
sysfs_remove_group(&client->dev.kobj, &tmp102_attr_group);

--
1.8.2.1.342.gfa7285d

2013-08-23 23:18:55

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 07/14] arm: dts: add omap4430 thermal data

This patch changes the dtsi entry on omap4430 to contain
the thermal data. This data will enable the passive
cooling with CPUfreq cooling device at 100C and the
system will do a thermal shutdown at 125C.

Cc: "Benoît Cousson" <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/arm/boot/dts/omap443x.dtsi | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/omap443x.dtsi b/arch/arm/boot/dts/omap443x.dtsi
index bcf455e..91a3a5d 100644
--- a/arch/arm/boot/dts/omap443x.dtsi
+++ b/arch/arm/boot/dts/omap443x.dtsi
@@ -25,9 +25,14 @@
};
};

+ thermal_zones{
+ #include "omap4-cpu-thermal.dtsi"
+ };
+
bandgap {
reg = <0x4a002260 0x4
0x4a00232C 0x4>;
compatible = "ti,omap4430-bandgap";
+ monitored-zones = <&cpu_thermal 0>;
};
};
--
1.8.2.1.342.gfa7285d

2013-08-23 23:19:11

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 08/14] arm: dts: add omap4460 thermal data

This patch changes the dtsi entry on omap4460 to contain
the thermal data. This data will enable the passive
cooling with CPUfreq cooling device at 100C and the
system will do a thermal shutdown at 125C.

Cc: "Benoît Cousson" <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Stephen Warren <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/arm/boot/dts/omap4460.dtsi | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/omap4460.dtsi b/arch/arm/boot/dts/omap4460.dtsi
index c2f0f39..9855f29 100644
--- a/arch/arm/boot/dts/omap4460.dtsi
+++ b/arch/arm/boot/dts/omap4460.dtsi
@@ -30,6 +30,10 @@
ti,hwmods = "debugss";
};

+ thermal_zones{
+ #include "omap4-cpu-thermal.dtsi"
+ };
+
bandgap {
reg = <0x4a002260 0x4
0x4a00232C 0x4
@@ -37,5 +41,6 @@
compatible = "ti,omap4460-bandgap";
interrupts = <0 126 IRQ_TYPE_LEVEL_HIGH>; /* talert */
gpios = <&gpio3 22 0>; /* tshut */
+ monitored-zones = <&cpu_thermal 0>;
};
};
--
1.8.2.1.342.gfa7285d

2013-08-23 23:19:26

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 09/14] arm: dts: point to cooling-zones on omap4430 cpu node

OMAP4430 devices can reach high temperatures and thus
needs to have cpufreq-cooling on systems running on it.

This patch links the cpu node to its thermal-zone
so that cpufreq-cpu0 driver loads the cooling device.

Cc: "Benoît Cousson" <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/arm/boot/dts/omap443x.dtsi | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/omap443x.dtsi b/arch/arm/boot/dts/omap443x.dtsi
index 91a3a5d..e4eea43 100644
--- a/arch/arm/boot/dts/omap443x.dtsi
+++ b/arch/arm/boot/dts/omap443x.dtsi
@@ -22,6 +22,7 @@
1008000 1375000
>;
clock-latency = <300000>; /* From legacy driver */
+ cooling-zones = <&cpu_thermal>;
};
};

--
1.8.2.1.342.gfa7285d

2013-08-23 23:19:32

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 10/14] arm: dts: point to cooling-zones on omap4460 cpu node

OMAP4460 devices can reach high temperatures and thus
needs to have cpufreq-cooling on systems running on it.

This patch links the cpu node to its thermal-zone
so that cpufreq-cpu0 driver loads the cooling device.

Cc: "Benoît Cousson" <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Stephen Warren <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/arm/boot/dts/omap4460.dtsi | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/omap4460.dtsi b/arch/arm/boot/dts/omap4460.dtsi
index 9855f29..991bd1b 100644
--- a/arch/arm/boot/dts/omap4460.dtsi
+++ b/arch/arm/boot/dts/omap4460.dtsi
@@ -20,6 +20,7 @@
920000 1313000
>;
clock-latency = <300000>; /* From legacy driver */
+ cooling-zones = <&cpu_thermal>;
};
};

--
1.8.2.1.342.gfa7285d

2013-08-23 23:19:44

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 11/14] arm: dts: add omap5 GPU thermal data

This patch changes a dtsi file to contain the thermal data
for GPU domain on OMAP5 and later SoCs. This data will
enable a thermal shutdown at 125C.

This thermal data can be reused across TI SoC devices.

Cc: "Benoît Cousson" <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Stephen Warren <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/arm/boot/dts/omap5-gpu-thermal.dtsi | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
create mode 100644 arch/arm/boot/dts/omap5-gpu-thermal.dtsi

diff --git a/arch/arm/boot/dts/omap5-gpu-thermal.dtsi b/arch/arm/boot/dts/omap5-gpu-thermal.dtsi
new file mode 100644
index 0000000..f421492
--- /dev/null
+++ b/arch/arm/boot/dts/omap5-gpu-thermal.dtsi
@@ -0,0 +1,27 @@
+
+/*
+ * Device Tree Source for OMAP543x SoC GPU thermal
+ *
+ * Copyright (C) 2013 Texas Instruments Incorporated - http://www.ti.com/
+ * Contact: Eduardo Valentin <[email protected]>
+ *
+ * This file is licensed under the terms of the GNU General Public License
+ * version 2. This program is licensed "as is" without any warranty of any
+ * kind, whether express or implied.
+ */
+
+#include <dt-bindings/thermal/thermal.h>
+
+gpu_thermal: gpu_thermal {
+ #thermal-cells = <1>;
+ mask = <0x01>; /* trips writability */
+ passive_delay = <250>; /* milliseconds */
+ polling_delay = <1000>; /* milliseconds */
+ trips {
+ crit@125000{
+ temperature = <125000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_CRITICAL>;
+ };
+ };
+};
--
1.8.2.1.342.gfa7285d

2013-08-23 23:20:16

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 12/14] arm: dts: add omap5 CORE thermal data

This patch changes a dtsi file to contain the thermal data
for CORE domain on OMAP5 and later SoCs. This data will
enable a thermal shutdown at 125C.

This thermal data can be reused across TI SoC devices.

Cc: "Benoît Cousson" <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Stephen Warren <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/arm/boot/dts/omap5-core-thermal.dtsi | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
create mode 100644 arch/arm/boot/dts/omap5-core-thermal.dtsi

diff --git a/arch/arm/boot/dts/omap5-core-thermal.dtsi b/arch/arm/boot/dts/omap5-core-thermal.dtsi
new file mode 100644
index 0000000..3d9fa09
--- /dev/null
+++ b/arch/arm/boot/dts/omap5-core-thermal.dtsi
@@ -0,0 +1,27 @@
+
+/*
+ * Device Tree Source for OMAP543x SoC CORE thermal
+ *
+ * Copyright (C) 2013 Texas Instruments Incorporated - http://www.ti.com/
+ * Contact: Eduardo Valentin <[email protected]>
+ *
+ * This file is licensed under the terms of the GNU General Public License
+ * version 2. This program is licensed "as is" without any warranty of any
+ * kind, whether express or implied.
+ */
+
+#include <dt-bindings/thermal/thermal.h>
+
+core_thermal: core_thermal {
+ #thermal-cells = <1>;
+ mask = <0x01>; /* trips writability */
+ passive_delay = <250>; /* milliseconds */
+ polling_delay = <1000>; /* milliseconds */
+ trips {
+ crit@125000{
+ temperature = <125000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_CRITICAL>;
+ };
+ };
+};
--
1.8.2.1.342.gfa7285d

2013-08-23 23:21:01

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 14/14] arm: dts: point to cooling-zones on omap5 cpu node

OMAP5 devices can reach high temperatures and thus
needs to have cpufreq-cooling on systems running on it.

This patch links the cpu node to its thermal-zone
so that cpufreq-cpu0 driver loads the cooling device.

Cc: "Benoît Cousson" <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Stephen Warren <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/arm/boot/dts/omap5.dtsi | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/omap5.dtsi b/arch/arm/boot/dts/omap5.dtsi
index 4a33fe0..335fcea 100644
--- a/arch/arm/boot/dts/omap5.dtsi
+++ b/arch/arm/boot/dts/omap5.dtsi
@@ -37,6 +37,7 @@
device_type = "cpu";
compatible = "arm,cortex-a15";
reg = <0x0>;
+ cooling-zones = <&cpu_thermal>;
};
cpu@1 {
device_type = "cpu";
--
1.8.2.1.342.gfa7285d

2013-08-23 23:20:11

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 13/14] arm: dts: add omap5 thermal data

This patch changes the dtsi entry on omap5 to contain
the thermal data. This data will enable the passive
cooling with CPUfreq cooling device at 100C. The
system will do a thermal shutdown at 125C whenever
any of its sensors sees this level.

Cc: "Benoît Cousson" <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Stephen Warren <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/arm/boot/dts/omap5.dtsi | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/omap5.dtsi b/arch/arm/boot/dts/omap5.dtsi
index e643620..4a33fe0 100644
--- a/arch/arm/boot/dts/omap5.dtsi
+++ b/arch/arm/boot/dts/omap5.dtsi
@@ -45,6 +45,12 @@
};
};

+ thermal_zones {
+ #include "omap4-cpu-thermal.dtsi"
+ #include "omap5-gpu-thermal.dtsi"
+ #include "omap5-core-thermal.dtsi"
+ };
+
timer {
compatible = "arm,armv7-timer";
/* PPI secure/nonsecure IRQ */
@@ -711,6 +717,9 @@
0x4a0023C0 0x3c>;
interrupts = <GIC_SPI 126 IRQ_TYPE_LEVEL_HIGH>;
compatible = "ti,omap5430-bandgap";
+ monitored-zones = <&cpu_thermal 0>,
+ <&gpu_thermal 1>,
+ <&core_thermal 2>;
};
};
};
--
1.8.2.1.342.gfa7285d

2013-08-23 23:18:54

by Eduardo Valentin

[permalink] [raw]
Subject: [RFC PATCH 06/14] arm: dts: add omap4 CPU thermal data

This patch changes a dtsi file to contain the thermal data
for MPU domain on OMAP4 and later SoCs. This data will
enable the passive cooling with CPUfreq cooling device
at 100C and the system will do a thermal shutdown at 125C.

This thermal data can be reused across TI SoC devices.

Cc: "Benoît Cousson" <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Stephen Warren <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/arm/boot/dts/omap4-cpu-thermal.dtsi | 40 ++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
create mode 100644 arch/arm/boot/dts/omap4-cpu-thermal.dtsi

diff --git a/arch/arm/boot/dts/omap4-cpu-thermal.dtsi b/arch/arm/boot/dts/omap4-cpu-thermal.dtsi
new file mode 100644
index 0000000..d74c565
--- /dev/null
+++ b/arch/arm/boot/dts/omap4-cpu-thermal.dtsi
@@ -0,0 +1,40 @@
+
+/*
+ * Device Tree Source for OMAP4/5 SoC CPU thermal
+ *
+ * Copyright (C) 2013 Texas Instruments Incorporated - http://www.ti.com/
+ * Contact: Eduardo Valentin <[email protected]>
+ *
+ * This file is licensed under the terms of the GNU General Public License
+ * version 2. This program is licensed "as is" without any warranty of any
+ * kind, whether express or implied.
+ */
+
+#include <dt-bindings/thermal/thermal.h>
+
+cpu_thermal: cpu_thermal {
+ #thermal-cells = <1>;
+ mask = <0x03>; /* trips writability */
+ passive_delay = <250>; /* milliseconds */
+ polling_delay = <1000>; /* milliseconds */
+ trips {
+ alert@100000{
+ temperature = <100000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_PASSIVE>;
+ };
+ crit@125000{
+ temperature = <125000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_CRITICAL>;
+ };
+ };
+ bind_params {
+ action@0{
+ cooling_device = "thermal-cpufreq";
+ weight = <100>; /* percentage */
+ mask = <0x01>;
+ /* no limits, using defaults */
+ };
+ };
+};
--
1.8.2.1.342.gfa7285d

2013-08-23 23:39:41

by Guenter Roeck

[permalink] [raw]
Subject: Re: [RFC PATCH 03/14] hwmon: lm75: expose to thermal fw via DT nodes

On Fri, Aug 23, 2013 at 07:15:44PM -0400, Eduardo Valentin wrote:
> This patch adds to lm75 temperature sensor the possibility
> to expose itself as thermal zone device, registered on the
> thermal framework.
>
> The thermal zone is built only if a device tree node
> describing a thermal zone for this sensor is present
> inside the lm75 DT node. Otherwise, the driver behavior
> will be the same.
>
> Cc: Jean Delvare <[email protected]>
> Cc: Guenter Roeck <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Eduardo Valentin <[email protected]>
> ---
> drivers/hwmon/lm75.c | 31 +++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
>
> diff --git a/drivers/hwmon/lm75.c b/drivers/hwmon/lm75.c
> index c03b490..dc55908 100644
> --- a/drivers/hwmon/lm75.c
> +++ b/drivers/hwmon/lm75.c
> @@ -27,6 +27,8 @@
> #include <linux/hwmon-sysfs.h>
> #include <linux/err.h>
> #include <linux/mutex.h>
> +#include <linux/thermal.h>
> +#include <linux/of.h>
> #include "lm75.h"
>
>
> @@ -70,6 +72,7 @@ static const u8 LM75_REG_TEMP[3] = {
> /* Each client has this additional data */
> struct lm75_data {
> struct device *hwmon_dev;
> + struct thermal_zone_device *tz;
> struct mutex update_lock;
> u8 orig_conf;
> u8 resolution; /* In bits, between 9 and 12 */
> @@ -92,6 +95,19 @@ static struct lm75_data *lm75_update_device(struct device *dev);
>
> /* sysfs attributes for hwmon */
>
> +static int lm75_read_temp(void *dev, unsigned long *temp)
> +{
> + struct lm75_data *data = lm75_update_device(dev);
> +
> + if (IS_ERR(data))
> + return PTR_ERR(data);
> +
> + *temp = ((data->temp[0] >> (16 - data->resolution)) * 1000) >>
> + (data->resolution - 8);
> +
> + return 0;
> +}
> +
> static ssize_t show_temp(struct device *dev, struct device_attribute *da,
> char *buf)
> {
> @@ -271,11 +287,25 @@ lm75_probe(struct i2c_client *client, const struct i2c_device_id *id)
> goto exit_remove;
> }
>
> + if (of_find_property(client->dev.of_node, "monitored-zones", NULL)) {
> + data->tz = thermal_zone_of_device_register(&client->dev,
> + 0,
> + false, /* -hwmon */
> + &client->dev,
> + lm75_read_temp);
> + if (IS_ERR(data->tz)) {
> + status = PTR_ERR(data->tz);
> + goto exit_hwmon;
> + }
> + }
> +
I don't think it should be fatal if thermal_zone_of_device_register fails.
hwmon itself still works fine, and should not be penaltized for a thermal
subsystem failure. Better display a warning if that happens and don't bail out.

The same comment applies to all patches affecting the hwmon subsystem.

> dev_info(&client->dev, "%s: sensor '%s'\n",
> dev_name(data->hwmon_dev), client->name);
>
> return 0;
>
> +exit_hwmon:
> + hwmon_device_unregister(data->hwmon_dev);
> exit_remove:
> sysfs_remove_group(&client->dev.kobj, &lm75_group);
> return status;
> @@ -285,6 +315,7 @@ static int lm75_remove(struct i2c_client *client)
> {
> struct lm75_data *data = i2c_get_clientdata(client);
>
> + thermal_zone_device_unregister(data->tz);
> hwmon_device_unregister(data->hwmon_dev);
> sysfs_remove_group(&client->dev.kobj, &lm75_group);
> lm75_write_value(client, LM75_REG_CONF, data->orig_conf);
> --
> 1.8.2.1.342.gfa7285d
>
>

2013-08-23 23:50:55

by Guenter Roeck

[permalink] [raw]
Subject: Re: [RFC PATCH 03/14] hwmon: lm75: expose to thermal fw via DT nodes

On Fri, Aug 23, 2013 at 07:15:44PM -0400, Eduardo Valentin wrote:
> This patch adds to lm75 temperature sensor the possibility
> to expose itself as thermal zone device, registered on the
> thermal framework.
>
> The thermal zone is built only if a device tree node
> describing a thermal zone for this sensor is present
> inside the lm75 DT node. Otherwise, the driver behavior
> will be the same.
>
> Cc: Jean Delvare <[email protected]>
> Cc: Guenter Roeck <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Eduardo Valentin <[email protected]>
> ---
> drivers/hwmon/lm75.c | 31 +++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
>
> diff --git a/drivers/hwmon/lm75.c b/drivers/hwmon/lm75.c
> index c03b490..dc55908 100644
> --- a/drivers/hwmon/lm75.c
> +++ b/drivers/hwmon/lm75.c
> @@ -27,6 +27,8 @@
> #include <linux/hwmon-sysfs.h>
> #include <linux/err.h>
> #include <linux/mutex.h>
> +#include <linux/thermal.h>
> +#include <linux/of.h>
> #include "lm75.h"
>
>
> @@ -70,6 +72,7 @@ static const u8 LM75_REG_TEMP[3] = {
> /* Each client has this additional data */
> struct lm75_data {
> struct device *hwmon_dev;
> + struct thermal_zone_device *tz;
> struct mutex update_lock;
> u8 orig_conf;
> u8 resolution; /* In bits, between 9 and 12 */
> @@ -92,6 +95,19 @@ static struct lm75_data *lm75_update_device(struct device *dev);
>
> /* sysfs attributes for hwmon */
>
> +static int lm75_read_temp(void *dev, unsigned long *temp)
> +{
> + struct lm75_data *data = lm75_update_device(dev);
> +
> + if (IS_ERR(data))
> + return PTR_ERR(data);
> +
> + *temp = ((data->temp[0] >> (16 - data->resolution)) * 1000) >>
> + (data->resolution - 8);
> +
The reported temperature can be negative, which would result in reporting a very
high temperature (note this applies to the tmp102 driver patch as well).

The computation is quite complex and matches the computation done in show_temp().
Given that, it would make sense to declare an inline function to convert the
register value to the temperature. This function can take the register value
and the resolution as argument.

Guenter

> + return 0;
> +}
> +
> static ssize_t show_temp(struct device *dev, struct device_attribute *da,
> char *buf)
> {
> @@ -271,11 +287,25 @@ lm75_probe(struct i2c_client *client, const struct i2c_device_id *id)
> goto exit_remove;
> }
>
> + if (of_find_property(client->dev.of_node, "monitored-zones", NULL)) {
> + data->tz = thermal_zone_of_device_register(&client->dev,
> + 0,
> + false, /* -hwmon */
> + &client->dev,
> + lm75_read_temp);
> + if (IS_ERR(data->tz)) {
> + status = PTR_ERR(data->tz);
> + goto exit_hwmon;
> + }
> + }
> +
> dev_info(&client->dev, "%s: sensor '%s'\n",
> dev_name(data->hwmon_dev), client->name);
>
> return 0;
>
> +exit_hwmon:
> + hwmon_device_unregister(data->hwmon_dev);
> exit_remove:
> sysfs_remove_group(&client->dev.kobj, &lm75_group);
> return status;
> @@ -285,6 +315,7 @@ static int lm75_remove(struct i2c_client *client)
> {
> struct lm75_data *data = i2c_get_clientdata(client);
>
> + thermal_zone_device_unregister(data->tz);
> hwmon_device_unregister(data->hwmon_dev);
> sysfs_remove_group(&client->dev.kobj, &lm75_group);
> lm75_write_value(client, LM75_REG_CONF, data->orig_conf);
> --
> 1.8.2.1.342.gfa7285d
>
>

2013-08-26 04:42:36

by Viresh Kumar

[permalink] [raw]
Subject: Re: [RFC PATCH 01/14] cpufreq: cpufreq-cpu0: add dt node parsing for 'cooling-zones'

On 24 August 2013 04:45, Eduardo Valentin <[email protected]> wrote:
> diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
> index ad1fde2..ede6487 100644
> --- a/drivers/cpufreq/cpufreq-cpu0.c
> +++ b/drivers/cpufreq/cpufreq-cpu0.c
> @@ -20,6 +20,9 @@
> #include <linux/platform_device.h>
> #include <linux/regulator/consumer.h>
> #include <linux/slab.h>
> +#include <linux/thermal.h>
> +#include <linux/cpu_cooling.h>
> +#include <linux/cpumask.h>

In alphabetical order please..

> @@ -268,6 +272,13 @@ static int cpu0_cpufreq_probe(struct platform_device *pdev)
> goto out_free_table;
> }
>
> + /*
> + * For now, just loading the cooling device;
> + * thermal DT code takes care of matching them.
> + */
> + if (of_find_property(np, "cooling-zones", NULL))
> + cdev = cpufreq_cooling_register(cpu_present_mask);

Should we check if it passed or failed? And if failed Atleast flag an
appropriate message?

2013-08-26 12:13:01

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 03/14] hwmon: lm75: expose to thermal fw via DT nodes

On 23-08-2013 19:39, Guenter Roeck wrote:
> On Fri, Aug 23, 2013 at 07:15:44PM -0400, Eduardo Valentin wrote:
>> This patch adds to lm75 temperature sensor the possibility
>> to expose itself as thermal zone device, registered on the
>> thermal framework.
>>
>> The thermal zone is built only if a device tree node
>> describing a thermal zone for this sensor is present
>> inside the lm75 DT node. Otherwise, the driver behavior
>> will be the same.
>>
>> Cc: Jean Delvare <[email protected]>
>> Cc: Guenter Roeck <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Eduardo Valentin <[email protected]>
>> ---
>> drivers/hwmon/lm75.c | 31 +++++++++++++++++++++++++++++++
>> 1 file changed, 31 insertions(+)
>>
>> diff --git a/drivers/hwmon/lm75.c b/drivers/hwmon/lm75.c
>> index c03b490..dc55908 100644
>> --- a/drivers/hwmon/lm75.c
>> +++ b/drivers/hwmon/lm75.c
>> @@ -27,6 +27,8 @@
>> #include <linux/hwmon-sysfs.h>
>> #include <linux/err.h>
>> #include <linux/mutex.h>
>> +#include <linux/thermal.h>
>> +#include <linux/of.h>
>> #include "lm75.h"
>>
>>
>> @@ -70,6 +72,7 @@ static const u8 LM75_REG_TEMP[3] = {
>> /* Each client has this additional data */
>> struct lm75_data {
>> struct device *hwmon_dev;
>> + struct thermal_zone_device *tz;
>> struct mutex update_lock;
>> u8 orig_conf;
>> u8 resolution; /* In bits, between 9 and 12 */
>> @@ -92,6 +95,19 @@ static struct lm75_data *lm75_update_device(struct device *dev);
>>
>> /* sysfs attributes for hwmon */
>>
>> +static int lm75_read_temp(void *dev, unsigned long *temp)
>> +{
>> + struct lm75_data *data = lm75_update_device(dev);
>> +
>> + if (IS_ERR(data))
>> + return PTR_ERR(data);
>> +
>> + *temp = ((data->temp[0] >> (16 - data->resolution)) * 1000) >>
>> + (data->resolution - 8);
>> +
>> + return 0;
>> +}
>> +
>> static ssize_t show_temp(struct device *dev, struct device_attribute *da,
>> char *buf)
>> {
>> @@ -271,11 +287,25 @@ lm75_probe(struct i2c_client *client, const struct i2c_device_id *id)
>> goto exit_remove;
>> }
>>
>> + if (of_find_property(client->dev.of_node, "monitored-zones", NULL)) {
>> + data->tz = thermal_zone_of_device_register(&client->dev,
>> + 0,
>> + false, /* -hwmon */
>> + &client->dev,
>> + lm75_read_temp);
>> + if (IS_ERR(data->tz)) {
>> + status = PTR_ERR(data->tz);
>> + goto exit_hwmon;
>> + }
>> + }
>> +
> I don't think it should be fatal if thermal_zone_of_device_register fails.
> hwmon itself still works fine, and should not be penaltized for a thermal
> subsystem failure. Better display a warning if that happens and don't bail out.
>
> The same comment applies to all patches affecting the hwmon subsystem.
>


OK. Sounds reasonable to me. I am going to simply attempt to parse the
DT node, and in case it fails, will log it and continue, as you suggest:
+ data->tz = thermal_zone_of_device_register(&client->dev,
+ 0,
+ false, /* -hwmon */
+ &client->dev,
+ lm75_read_temp);
+ if (IS_ERR(data->tz))
+ dev_warn(&client->dev, "Could not parse device tree thermal data\n");

>> dev_info(&client->dev, "%s: sensor '%s'\n",
>> dev_name(data->hwmon_dev), client->name);
>>
>> return 0;
>>
>> +exit_hwmon:
>> + hwmon_device_unregister(data->hwmon_dev);
>> exit_remove:
>> sysfs_remove_group(&client->dev.kobj, &lm75_group);
>> return status;
>> @@ -285,6 +315,7 @@ static int lm75_remove(struct i2c_client *client)
>> {
>> struct lm75_data *data = i2c_get_clientdata(client);
>>
>> + thermal_zone_device_unregister(data->tz);
>> hwmon_device_unregister(data->hwmon_dev);
>> sysfs_remove_group(&client->dev.kobj, &lm75_group);
>> lm75_write_value(client, LM75_REG_CONF, data->orig_conf);
>> --
>> 1.8.2.1.342.gfa7285d
>>
>>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-08-26 12:14:55

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 01/14] cpufreq: cpufreq-cpu0: add dt node parsing for 'cooling-zones'

On 26-08-2013 00:42, Viresh Kumar wrote:
> On 24 August 2013 04:45, Eduardo Valentin <[email protected]> wrote:
>> diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
>> index ad1fde2..ede6487 100644
>> --- a/drivers/cpufreq/cpufreq-cpu0.c
>> +++ b/drivers/cpufreq/cpufreq-cpu0.c
>> @@ -20,6 +20,9 @@
>> #include <linux/platform_device.h>
>> #include <linux/regulator/consumer.h>
>> #include <linux/slab.h>
>> +#include <linux/thermal.h>
>> +#include <linux/cpu_cooling.h>
>> +#include <linux/cpumask.h>
>
> In alphabetical order please..
>

OK.

>> @@ -268,6 +272,13 @@ static int cpu0_cpufreq_probe(struct platform_device *pdev)
>> goto out_free_table;
>> }
>>
>> + /*
>> + * For now, just loading the cooling device;
>> + * thermal DT code takes care of matching them.
>> + */
>> + if (of_find_property(np, "cooling-zones", NULL))
>> + cdev = cpufreq_cooling_register(cpu_present_mask);
>
> Should we check if it passed or failed? And if failed Atleast flag an
> appropriate message?
>

Yes, we need error checking code. I will add in next version. Thanks Kumar.

>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-08-26 12:17:11

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 03/14] hwmon: lm75: expose to thermal fw via DT nodes

On 23-08-2013 19:50, Guenter Roeck wrote:
> On Fri, Aug 23, 2013 at 07:15:44PM -0400, Eduardo Valentin wrote:
>> This patch adds to lm75 temperature sensor the possibility
>> to expose itself as thermal zone device, registered on the
>> thermal framework.
>>
>> The thermal zone is built only if a device tree node
>> describing a thermal zone for this sensor is present
>> inside the lm75 DT node. Otherwise, the driver behavior
>> will be the same.
>>
>> Cc: Jean Delvare <[email protected]>
>> Cc: Guenter Roeck <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Eduardo Valentin <[email protected]>
>> ---
>> drivers/hwmon/lm75.c | 31 +++++++++++++++++++++++++++++++
>> 1 file changed, 31 insertions(+)
>>
>> diff --git a/drivers/hwmon/lm75.c b/drivers/hwmon/lm75.c
>> index c03b490..dc55908 100644
>> --- a/drivers/hwmon/lm75.c
>> +++ b/drivers/hwmon/lm75.c
>> @@ -27,6 +27,8 @@
>> #include <linux/hwmon-sysfs.h>
>> #include <linux/err.h>
>> #include <linux/mutex.h>
>> +#include <linux/thermal.h>
>> +#include <linux/of.h>
>> #include "lm75.h"
>>
>>
>> @@ -70,6 +72,7 @@ static const u8 LM75_REG_TEMP[3] = {
>> /* Each client has this additional data */
>> struct lm75_data {
>> struct device *hwmon_dev;
>> + struct thermal_zone_device *tz;
>> struct mutex update_lock;
>> u8 orig_conf;
>> u8 resolution; /* In bits, between 9 and 12 */
>> @@ -92,6 +95,19 @@ static struct lm75_data *lm75_update_device(struct device *dev);
>>
>> /* sysfs attributes for hwmon */
>>
>> +static int lm75_read_temp(void *dev, unsigned long *temp)
>> +{
>> + struct lm75_data *data = lm75_update_device(dev);
>> +
>> + if (IS_ERR(data))
>> + return PTR_ERR(data);
>> +
>> + *temp = ((data->temp[0] >> (16 - data->resolution)) * 1000) >>
>> + (data->resolution - 8);
>> +
> The reported temperature can be negative, which would result in reporting a very
> high temperature (note this applies to the tmp102 driver patch as well).
>

Good point.

> The computation is quite complex and matches the computation done in show_temp().
> Given that, it would make sense to declare an inline function to convert the
> register value to the temperature. This function can take the register value
> and the resolution as argument.

Sounds reasonable. Modifying accordingly in next version.

>
> Guenter
>
>> + return 0;
>> +}
>> +
>> static ssize_t show_temp(struct device *dev, struct device_attribute *da,
>> char *buf)
>> {
>> @@ -271,11 +287,25 @@ lm75_probe(struct i2c_client *client, const struct i2c_device_id *id)
>> goto exit_remove;
>> }
>>
>> + if (of_find_property(client->dev.of_node, "monitored-zones", NULL)) {
>> + data->tz = thermal_zone_of_device_register(&client->dev,
>> + 0,
>> + false, /* -hwmon */
>> + &client->dev,
>> + lm75_read_temp);
>> + if (IS_ERR(data->tz)) {
>> + status = PTR_ERR(data->tz);
>> + goto exit_hwmon;
>> + }
>> + }
>> +
>> dev_info(&client->dev, "%s: sensor '%s'\n",
>> dev_name(data->hwmon_dev), client->name);
>>
>> return 0;
>>
>> +exit_hwmon:
>> + hwmon_device_unregister(data->hwmon_dev);
>> exit_remove:
>> sysfs_remove_group(&client->dev.kobj, &lm75_group);
>> return status;
>> @@ -285,6 +315,7 @@ static int lm75_remove(struct i2c_client *client)
>> {
>> struct lm75_data *data = i2c_get_clientdata(client);
>>
>> + thermal_zone_device_unregister(data->tz);
>> hwmon_device_unregister(data->hwmon_dev);
>> sysfs_remove_group(&client->dev.kobj, &lm75_group);
>> lm75_write_value(client, LM75_REG_CONF, data->orig_conf);
>> --
>> 1.8.2.1.342.gfa7285d
>>
>>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-08-27 09:29:37

by Mark Rutland

[permalink] [raw]
Subject: Re: [RFC PATCH 01/14] cpufreq: cpufreq-cpu0: add dt node parsing for 'cooling-zones'

On Sat, Aug 24, 2013 at 12:15:42AM +0100, Eduardo Valentin wrote:
> This patch changes the cpufreq-cpu0 driver to consider if
> a cpu needs cooling (with cpufreq). In case the cooling is needed,
> it can be flagged at the cpu0 device tree node, with the list
> of zones property 'cooling-zones'.
>
> In case this list of zones is present, the driver will
> load a cpufreq cooling device in the system. The cpufreq-cpu0
> driver is not interested in determining how the system should
> be using the cooling device. The driver is responsible
> only of loading the cooling device.
>
> Describing how the cooling device will be used can be
> accomplished by setting up a thermal zone that references
> and is composed by the cpufreq cooling device.
>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Viresh Kumar <[email protected]>
> Cc: Grant Likely <[email protected]>
> Cc: Rob Herring <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Eduardo Valentin <[email protected]>
> ---
> Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt | 4 ++++
> drivers/cpufreq/cpufreq-cpu0.c | 12 ++++++++++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
> index 051f764..add50f7 100644
> --- a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
> +++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
> @@ -15,6 +15,9 @@ Optional properties:
> - clock-latency: Specify the possible maximum transition latency for clock,
> in unit of nanoseconds.
> - voltage-tolerance: Specify the CPU voltage tolerance in percentage.
> +- cooling-zones: A list of thermal zones phandles. The generic cpu
> + cooling (freq clipping) is loaded by the generic cpufreq-cpu0 driver
> + in case the device tree node has this list.

Bindings should not describe the behaviour of any kernel (and for that
reason I'm not very keen on the current cpufreq-cpu0 binding document).

Bindings should simply describe the hardware. This addition could
instead be:

- cooling-zones: a list of thermal zone phandles.

However, as the thermal zone binding doesn't seem to have appeared by
this patch, it should get moved later anyway...

Thanks,
Mark.

>
> Examples:
>
> @@ -33,6 +36,7 @@ cpus {
> 198000 850000
> >;
> clock-latency = <61036>; /* two CLK32 periods */
> + cooling-zones = <&cpu_thermal>;
> };
>
> cpu@1 {
> diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
> index ad1fde2..ede6487 100644
> --- a/drivers/cpufreq/cpufreq-cpu0.c
> +++ b/drivers/cpufreq/cpufreq-cpu0.c
> @@ -20,6 +20,9 @@
> #include <linux/platform_device.h>
> #include <linux/regulator/consumer.h>
> #include <linux/slab.h>
> +#include <linux/thermal.h>
> +#include <linux/cpu_cooling.h>
> +#include <linux/cpumask.h>
>
> static unsigned int transition_latency;
> static unsigned int voltage_tolerance; /* in percentage */
> @@ -28,6 +31,7 @@ static struct device *cpu_dev;
> static struct clk *cpu_clk;
> static struct regulator *cpu_reg;
> static struct cpufreq_frequency_table *freq_table;
> +static struct thermal_cooling_device *cdev;
>
> static int cpu0_verify_speed(struct cpufreq_policy *policy)
> {
> @@ -268,6 +272,13 @@ static int cpu0_cpufreq_probe(struct platform_device *pdev)
> goto out_free_table;
> }
>
> + /*
> + * For now, just loading the cooling device;
> + * thermal DT code takes care of matching them.
> + */
> + if (of_find_property(np, "cooling-zones", NULL))
> + cdev = cpufreq_cooling_register(cpu_present_mask);
> +
> of_node_put(np);
> of_node_put(parent);
> return 0;
> @@ -283,6 +294,7 @@ out_put_parent:
>
> static int cpu0_cpufreq_remove(struct platform_device *pdev)
> {
> + cpufreq_cooling_unregister(cdev);
> cpufreq_unregister_driver(&cpu0_cpufreq_driver);
> opp_free_cpufreq_table(cpu_dev, &freq_table);
>
> --
> 1.8.2.1.342.gfa7285d
>
>

2013-08-27 10:22:13

by Mark Rutland

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

Hi,

On Sat, Aug 24, 2013 at 12:15:43AM +0100, Eduardo Valentin wrote:
> In order to be able to build thermal policies
> based on generic sensors, like I2C device, that
> can be places in different points on different boards,
> there is a need to have a way to feed board dependent
> data into the thermal framework.
>
> This patch introduces a thermal data parser for device
> tree. The parsed data is used to build thermal zones
> and thermal binding parameters. The output data
> can then be used to deploy thermal policies.
>
> This patch adds also documentation regarding this
> API and how to define tree nodes to use
> this infrastructure.
>
> Cc: Zhang Rui <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Eduardo Valentin <[email protected]>
> ---
> .../devicetree/bindings/thermal/thermal.txt | 160 +++++++
> drivers/thermal/Kconfig | 13 +
> drivers/thermal/Makefile | 1 +
> drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++
> include/dt-bindings/thermal/thermal.h | 27 ++
> include/linux/thermal.h | 3 +
> 6 files changed, 677 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt
> create mode 100644 drivers/thermal/thermal_dt.c
> create mode 100644 include/dt-bindings/thermal/thermal.h
>
> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
> new file mode 100644
> index 0000000..af20ab0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt
> @@ -0,0 +1,160 @@
> +----------------------------------------
> +Thermal Framework Device Tree descriptor
> +----------------------------------------
> +
> +This file describes how to define a thermal structure using device tree.
> +A thermal structure includes thermal zones and their components, such
> +as name, trip points, polling intervals and cooling devices binding
> +descriptors. A binding descriptor may contain information on how to
> +react, with a cooling stepped action or a weight on a fair share.
> +The target of device tree thermal descriptors is to describe only
> +the hardware thermal aspects, not how the system must control or which
> +algorithm or policy must take place. It follows a description of each
> +type of device tree node.
> +
> +****
> +trip
> +****
> +
> +The trip node is a node to describe a point in the temperature domain
> +in which the system takes an action. This node describes just the point,
> +not the action.
> +
> +A node describing a trip must contain:
> +- temperature: the trip temperature level, in milliCelsius.
> +- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative
> +value, in milliCelsius.

Presumably this is a single (u32) cell?

> +- type: the trip type. Here is the type mapping:
> + THERMAL_TRIP_ACTIVE
> + THERMAL_TRIP_PASSIVE
> + THERMAL_TRIP_HOT
> + THERMAL_TRIP_CRITICAL

What type is this property?

What are these values? Do you need to refer to a header somewhere?
Symbolic constants should have an associated value for an ABI.

> +
> +**********
> +bind_param
> +**********
> +
> +The bind_param node is a node to describe how cooling devices get assigned
> +to trip points of the zone. The cooling devices are expected to be loaded
> +in the target system.
> +
> +A node describing a bind_param must contain:
> +- cooling_device: A string with the cooling device name.

Please use '-' rather than '_' in bindings. That's the normal devicetree
convention, and there's no point gonig against it and causing more
confusion.

What are valid values of this string, and what do they mean?

Why not a phandle + cells binding to cooling devices?

> +- weight: The 'influence' of a particular cooling device on this zone.
> + This is on a percentage scale. The sum of all these weights
> + (for a particular zone) cannot exceed 100.

This is a bit fuzzy, and certainly needs more guidance.

> +- trip_mask: This is a bit mask that gives the binding relation between
> + this thermal zone and cdev, for a particular trip point.
> + If nth bit is set, then the cdev and thermal zone are bound
> + for trip point n.

What type is this?

The nth bit from which end?

Could you explain what "bound for trip point n" means?

Just because this is a bitmask in Linux does not mean it needs to be so
in the dt.

> +- limits: An array integers consisting of 2-tuples items, and each item means
> + lower and upper state limits, like <min-state max-state>.
> +
> +**************************
> +temperature sensor devices
> +**************************
> +
> +Temperature sensor devices have to be linked to a specific thermal zone.
> +This is done by means of the 'monitored-zones' list.
> +- monitored-zones: A list of thermal zones phandles, identifying which
> +thermal zones the temperature sensor device is used to monitor.

Why does this need to be described that way around?

Can the zone not have a link to the sensor(s) it needs?

> +
> +************
> +thermal_zone
> +************
> +
> +The thermal_zone node is the node containing all the required info
> +for describing a thermal zone, including its cdev bindings. The thermal_zone
> +node must contain, apart from its own properties, one node containing
> +trip nodes and one node containing all the zone bind parameters.
> +
> +Required properties:
> +- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable.

What type is a bit string in devicetree? I'm aware of cells and (byte)
strings, but not bit strings.

What does 'writeable' mean in this context?

There's presumably a better name than 'mask'.

> +
> +- passive_delay: number of milliseconds to wait between polls when
> + performing passive cooling.
> +
> +- polling_delay: number of milliseconds to wait between polls when checking.

These sound very much like configuration of the driver rather than
hardware description. What if my temperature sensor can generate an
interrupt at some trigger temperature? Why would I need polling times?

Why does this need to be in the dt at all? Surely the OS can choose some
sensible polling period, possibly dynamically?

> +
> +- #thermal-cells: An integer that is used to specify how many cells compose
> + sensor ids.

I don't see why this is necessary. If the zone itself described its
related sensors rather than the other way around, we wouldn't need it at
all. Most bindings so far have consumers link to their providers rather
than the other way around.

> +
> +Below is an example:
> +cpu_thermal: thermal_zone {
> + #thermal-cells = <1>;
> + mask = <0x03>; /* trips writability */
> + passive_delay = <250>; /* milliseconds */
> + polling_delay = <1000>; /* milliseconds */
> + trips {
> + alert@100000{
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <0>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + crit@125000{
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <0>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + bind_params {
> + action@0{
> + cooling_device = "thermal-cpufreq";

NAK. That value was not defined anywhere, and represents Linux
infrastructure rather than hardware.

For this case, we could just as easily describe the thermal points, and
if no physical cooling device (e.g. a fan) is present, assume
cpufreq-base cooling. Other OSs could choose to do otherwise, and we
could change later...

> + weight = <100>; /* percentage */
> + mask = <0x01>;

This should presumably be trip-mask (assuming my suggested corrections).

What is it supposed to mean here?

> + limits = <
> + /* lower-state upper-state */
> + 1 THERMAL_NO_LIMITS
> + THERMAL_NO_LIMITS THERMAL_NO_LIMITS
> + >;
> + };
> + };
> +};
> +
> +sensor: adc@0xFFFF {
> + ...
> + monitored-zones = <&cpu_thermal 0>;
> +};
> +
> +cpu@0 {
> + ...
> + cooling-zones = <&cpu_thermal>;
> +};
> +
> +In the example above, the ADC sensor at address 0xFFFF is used to monitor
> +the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked
> +to the same thermal zone 'cpu_thermal' as a cooling device.

Huh? That seems to imply the the '0' in '<&cpu_thermal 0>' on the sensor
node describes a property of the sensor rather than the thermal node,
unless I've misunderstood?

> +
> +***************
> +cpufreq-cooling
> +***************
> +
> +The generic cpu cooling (freq clipping) can be loaded by the
> +generic cpufreq-cpu0 driver in case the device tree node informs
> +this need.
> +
> +In order to load the cpufreq cooling device, it is possible to
> +inform that the cpu needs cooling by adding the list of thermal zones
> +in the 'cooling-zones' property at the cpu0 phandle.

This should be reworded so as to describe the binding rather than the
Linux behaviour based on the binding.

> +
> +Example:
> + cpus {
> + cpu@0 {
> + operating-points = <
> + /* kHz uV */
> + 800000 1313000
> + 1008000 1375000
> + >;
> + cooling-zones = <&cpu_thermal>;
> + };
> + ...
> + };
> +
> +The above will cause the cpufreq-cpu0 driver to understand that
> +the cpu will need passive cooling and while probing that node it will
> +also load the cpufreq cpu cooling device in that system.

The above describe that the CPU is part of a cooling (thermal?) zone and
requires cooling.

> +
> +However, be advised that this flag will not describe what needs to
> +be done or how the cpufreq cooling device will be used. In order to
> +accomplish this, one needs to write a phandle for a thermal zone, as
> +described in the section 'thermal_zone'.

The above already has a phandle to a thermal (cooling?) zone.

Please choose either cooling zone or thermal zone, having the two names
makes this more confusing than necessary.

> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 7fb16bc..753f0dc 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -29,6 +29,19 @@ config THERMAL_HWMON
> Say 'Y' here if you want all thermal sensors to
> have hwmon sysfs interface too.
>
> +config THERMAL_OF
> + bool
> + prompt "APIs to parse thermal data out of device tree"
> + depends on OF
> + default y
> + help
> + This options provides helpers to add the support to
> + read and parse thermal data definitions out of the
> + device tree blob.
> +
> + Say 'Y' here if you need to build thermal infrastructure
> + based on device tree.
> +
> choice
> prompt "Default Thermal governor"
> default THERMAL_DEFAULT_GOV_STEP_WISE
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index 24cb894..eedb273 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o
>
> # interface to/from other layers providing sensors
> thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o
> +thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o
>
> # governors
> thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o
> diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c
> new file mode 100644
> index 0000000..cc35485
> --- /dev/null
> +++ b/drivers/thermal/thermal_dt.c
> @@ -0,0 +1,473 @@
> +/*
> + * thermal_dt.c - Generic Thermal Management device tree support.
> + *
> + * Copyright (C) 2013 Texas Instruments
> + * Copyright (C) 2013 Eduardo Valentin <[email protected]>
> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; version 2 of the License.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, write to the Free Software Foundation, Inc.,
> + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
> + *
> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> + */
> +#include <linux/thermal.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +#include <linux/of_device.h>
> +#include <linux/of_platform.h>
> +#include <linux/err.h>
> +#include <linux/export.h>
> +#include <linux/string.h>
> +
> +struct __thermal_bind_params {
> + char cooling_device[THERMAL_NAME_LENGTH];
> +};
> +
> +static
> +int thermal_of_match(struct thermal_zone_device *tz,
> + struct thermal_cooling_device *cdev);
> +
> +static int thermal_of_populate_bind_params(struct device *dev,
> + struct device_node *node,
> + struct __thermal_bind_params *__tbp,
> + struct thermal_bind_params *tbp,
> + int ntrips)
> +{
> + const char *cdev_name;
> + u32 *limits;
> + int ret;
> + u32 prop;
> +
> + ret = of_property_read_u32(node, "weight", &prop);
> + if (ret < 0) {
> + dev_err(dev, "missing weight property\n");
> + return ret;
> + }
> + tbp->weight = prop;
> +
> + ret = of_property_read_u32(node, "mask", &prop);
> + if (ret < 0) {
> + dev_err(dev, "missing mask property\n");
> + return ret;
> + }
> + tbp->trip_mask = prop;
> +
> + /* this will allow us to bind with cooling devices */
> + tbp->match = thermal_of_match;
> +
> + ret = of_property_read_string(node, "cooling_device", &cdev_name);
> + if (ret < 0) {
> + dev_err(dev, "missing cooling_device property\n");
> + return ret;
> + }
> + strncpy(__tbp->cooling_device, cdev_name,
> + sizeof(__tbp->cooling_device));
> +
> + limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL);

Why do you use kzalloc here, but devm_kzalloc elsewhere? That'll lead to
problems as the driver gets refactored over time.

> + if (!limits) {
> + dev_err(dev, "not enough mem for reading limits\n");
> + return -ENOMEM;
> + }
> + ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips);
> + if (!ret) {
> + int i = ntrips;
> +
> + tbp->binding_limits = devm_kzalloc(dev,
> + ntrips * 2 *
> + sizeof(*tbp->binding_limits),
> + GFP_KERNEL);
> + if (!tbp->binding_limits) {
> + dev_err(dev, "not enough mem for storing limits\n");
> + return -ENOMEM;
> + }
> + while (i--)
> + tbp->binding_limits[i] = limits[i];
> + }
> + kfree(limits);
> +
> + return 0;

What if you failed to read the array? Surely you should return an error?

> +}
> +
> +struct __thermal_trip {
> + unsigned long int temperature;
> + unsigned long int hysteresis;
> + enum thermal_trip_type type;
> +};
> +
> +static
> +int thermal_of_populate_trip(struct device *dev,
> + struct device_node *node,
> + struct __thermal_trip *trip)
> +{
> + int ret;
> + int prop;
> +
> + ret = of_property_read_u32(node, "temperature", &prop);
> + if (ret < 0) {
> + dev_err(dev, "missing temperature property\n");
> + return ret;
> + }
> + trip->temperature = prop;
> +
> + ret = of_property_read_u32(node, "hysteresis", &prop);
> + if (ret < 0) {
> + dev_err(dev, "missing hysteresis property\n");
> + return ret;
> + }
> + trip->hysteresis = prop;
> +
> + ret = of_property_read_u32(node, "type", &prop);
> + if (ret < 0) {
> + dev_err(dev, "missing type property\n");
> + return ret;
> + }
> + trip->type = prop;

This will require sanity checking.

> +
> + return 0;
> +}
> +
> +struct __thermal_zone_device {
> + enum thermal_device_mode mode;
> + int passive_delay;
> + int polling_delay;
> + int mask;
> + int ntrips;
> + char type[THERMAL_NAME_LENGTH];
> + struct __thermal_trip *trips;
> + struct __thermal_bind_params *bind_params;
> + struct thermal_bind_params *tbps;
> + struct thermal_zone_params zone_params;
> + int (*get_temp)(void *, unsigned long *);
> + void *devdata;
> +};
> +
> +static struct __thermal_zone_device *
> +thermal_of_build_thermal_zone(struct device *dev, struct device_node *node)
> +{
> + struct device_node *child, *gchild;
> + struct __thermal_zone_device *tz;
> + int ret, i;
> + u32 prop;
> +
> + if (!node) {
> + dev_err(dev, "no thermal zone node\n");
> + return ERR_PTR(-EINVAL);
> + }
> +
> + tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL);
> + if (!tz) {
> + dev_err(dev, "not enough memory for thermal of zone\n");
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + ret = of_property_read_u32(node, "passive_delay", &prop);
> + if (ret < 0) {
> + dev_err(dev, "missing passive_delay property\n");
> + return ERR_PTR(ret);
> + }
> + tz->passive_delay = prop;
> +
> + ret = of_property_read_u32(node, "polling_delay", &prop);
> + if (ret < 0) {
> + dev_err(dev, "missing polling_delay property\n");
> + return ERR_PTR(ret);
> + }
> + tz->polling_delay = prop;
> +
> + ret = of_property_read_u32(node, "mask", &prop);
> + if (ret < 0) {
> + dev_err(dev, "missing mask property\n");
> + return ERR_PTR(ret);
> + }
> + tz->mask = prop;
> +
> + /* assume node name as our zone type */
> + strncpy(tz->type, node->name, sizeof(tz->type));
> +
> + /* default policy */
> + strncpy(tz->zone_params.governor_name, "step_wise",
> + sizeof(tz->zone_params.governor_name));
> +
> + /* trips */
> + child = of_find_node_by_name(node, "trips");

That might not be a child node, as of_find_node_by_name walks over the
list of all nodes. Use of_get_child_by_name.

> +
> + /* No trips provided */
> + if (!child)
> + goto finish;
> +
> + tz->ntrips = of_get_child_count(child);
> + tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips),
> + GFP_KERNEL);
> + if (!tz->trips)
> + return ERR_PTR(-ENOMEM);
> + i = 0;
> + for_each_child_of_node(child, gchild)
> + thermal_of_populate_trip(dev, gchild, &tz->trips[i++]);

What if a child node was not a valid trip node? It seems
thermal_of_populate_trip returns errors you're ignoring.

> +
> + /* bind_params */
> + child = of_find_node_by_name(node, "bind_params");

Another of_get_child_by_name candidate.

> +
> + /* No binding info */
> + if (!child)
> + goto finish;
> +
> + tz->zone_params.num_tbps = of_get_child_count(child);
> + tz->bind_params = devm_kzalloc(dev,
> + tz->zone_params.num_tbps *
> + sizeof(*tz->bind_params),
> + GFP_KERNEL);
> + if (!tz->bind_params)
> + return ERR_PTR(-ENOMEM);
> + tz->zone_params.tbp = devm_kzalloc(dev,
> + tz->zone_params.num_tbps *
> + sizeof(*tz->zone_params.tbp),
> + GFP_KERNEL);
> + if (!tz->zone_params.tbp)
> + return ERR_PTR(-ENOMEM);
> + i = 0;
> + for_each_child_of_node(child, gchild) {
> + thermal_of_populate_bind_params(dev, gchild,
> + &tz->bind_params[i],
> + &tz->zone_params.tbp[i],
> + tz->ntrips);

Similarly, you're ignoring error conditions here. What if a bind_params
node was malformed?

> + i++;
> + }
> +
> +finish:
> + tz->mode = THERMAL_DEVICE_ENABLED;
> +
> + return tz;
> +}
> +
> +static
> +int thermal_of_match(struct thermal_zone_device *tz,
> + struct thermal_cooling_device *cdev)
> +{
> + struct __thermal_zone_device *data = tz->devdata;
> + int i;
> +
> + for (i = 0; i < data->zone_params.num_tbps; i++) {
> + if (!strncmp(data->bind_params[i].cooling_device,
> + cdev->type,
> + strlen(data->bind_params[i].cooling_device)) &&
> + (data->zone_params.tbp[i].trip_mask & (1 << i)))
> + return 0;
> + }
> +
> + return -EINVAL;
> +}

I'm not keen on binding to cooling devices using a string, as suddenly a
name to allow humans to inspect the system is turned into an ABI.

> +
> +static
> +int of_thermal_get_temp(struct thermal_zone_device *tz,
> + unsigned long *temp)
> +{
> + struct __thermal_zone_device *data = tz->devdata;
> +
> + return data->get_temp(data->devdata, temp);
> +}
> +
> +static
> +int of_thermal_get_mode(struct thermal_zone_device *tz,
> + enum thermal_device_mode *mode)
> +{
> + struct __thermal_zone_device *data = tz->devdata;
> +
> + *mode = data->mode;
> +
> + return 0;
> +}
> +
> +static
> +int of_thermal_set_mode(struct thermal_zone_device *tz,
> + enum thermal_device_mode mode)
> +{
> + struct __thermal_zone_device *data = tz->devdata;
> +
> + mutex_lock(&tz->lock);
> +
> + if (mode == THERMAL_DEVICE_ENABLED)
> + tz->polling_delay = data->polling_delay;
> + else
> + tz->polling_delay = 0;
> +
> + mutex_unlock(&tz->lock);
> +
> + data->mode = mode;
> + thermal_zone_device_update(tz);
> +
> + return 0;
> +}
> +
> +static
> +int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip,
> + enum thermal_trip_type *type)
> +{
> + struct __thermal_zone_device *data = tz->devdata;
> +
> + if (trip >= data->ntrips || trip < 0)
> + return -EDOM;

It's far more common to use -EINVAL.

> +
> + *type = data->trips[trip].type;
> +
> + return 0;
> +}
> +
> +static
> +int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip,
> + unsigned long *temp)
> +{
> + struct __thermal_zone_device *data = tz->devdata;
> +
> + if (trip >= data->ntrips || trip < 0)
> + return -EDOM;
> +
> + *temp = data->trips[trip].temperature;
> +
> + return 0;
> +}
> +
> +static
> +int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip,
> + unsigned long temp)
> +{
> + struct __thermal_zone_device *data = tz->devdata;
> +
> + if (trip >= data->ntrips || trip < 0)
> + return -EDOM;
> +
> + /* thermal fw should take care of data->mask & (1 << trip) */
> + data->trips[trip].temperature = temp;
> +
> + return 0;
> +}
> +
> +static
> +int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip,
> + unsigned long *hyst)
> +{
> + struct __thermal_zone_device *data = tz->devdata;
> +
> + if (trip >= data->ntrips || trip < 0)
> + return -EDOM;
> +
> + *hyst = data->trips[trip].hysteresis;
> +
> + return 0;
> +}
> +
> +static
> +int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip,
> + unsigned long hyst)
> +{
> + struct __thermal_zone_device *data = tz->devdata;
> +
> + if (trip >= data->ntrips || trip < 0)
> + return -EDOM;
> +
> + /* thermal fw should take care of data->mask & (1 << trip) */

Could you elaborate on that comment?

> + data->trips[trip].hysteresis = hyst;
> +
> + return 0;
> +}

[...]

Thanks,
Mark.

2013-08-27 10:26:13

by Mark Rutland

[permalink] [raw]
Subject: Re: [RFC PATCH 03/14] hwmon: lm75: expose to thermal fw via DT nodes

On Sat, Aug 24, 2013 at 12:15:44AM +0100, Eduardo Valentin wrote:
> This patch adds to lm75 temperature sensor the possibility
> to expose itself as thermal zone device, registered on the
> thermal framework.
>
> The thermal zone is built only if a device tree node
> describing a thermal zone for this sensor is present
> inside the lm75 DT node. Otherwise, the driver behavior
> will be the same.

There'll need to be a binding document if you're extending the lm75
binding. See the comment at the top of [1]. However, I don't think you
need this at all, as I explain in my comments on the previous patch.

Thanks,
Mark.

[1] Documentation/devicetree/bindings/i2c/trivial-devices.txt

>
> Cc: Jean Delvare <[email protected]>
> Cc: Guenter Roeck <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Eduardo Valentin <[email protected]>
> ---
> drivers/hwmon/lm75.c | 31 +++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
>
> diff --git a/drivers/hwmon/lm75.c b/drivers/hwmon/lm75.c
> index c03b490..dc55908 100644
> --- a/drivers/hwmon/lm75.c
> +++ b/drivers/hwmon/lm75.c
> @@ -27,6 +27,8 @@
> #include <linux/hwmon-sysfs.h>
> #include <linux/err.h>
> #include <linux/mutex.h>
> +#include <linux/thermal.h>
> +#include <linux/of.h>
> #include "lm75.h"
>
>
> @@ -70,6 +72,7 @@ static const u8 LM75_REG_TEMP[3] = {
> /* Each client has this additional data */
> struct lm75_data {
> struct device *hwmon_dev;
> + struct thermal_zone_device *tz;
> struct mutex update_lock;
> u8 orig_conf;
> u8 resolution; /* In bits, between 9 and 12 */
> @@ -92,6 +95,19 @@ static struct lm75_data *lm75_update_device(struct device *dev);
>
> /* sysfs attributes for hwmon */
>
> +static int lm75_read_temp(void *dev, unsigned long *temp)
> +{
> + struct lm75_data *data = lm75_update_device(dev);
> +
> + if (IS_ERR(data))
> + return PTR_ERR(data);
> +
> + *temp = ((data->temp[0] >> (16 - data->resolution)) * 1000) >>
> + (data->resolution - 8);
> +
> + return 0;
> +}
> +
> static ssize_t show_temp(struct device *dev, struct device_attribute *da,
> char *buf)
> {
> @@ -271,11 +287,25 @@ lm75_probe(struct i2c_client *client, const struct i2c_device_id *id)
> goto exit_remove;
> }
>
> + if (of_find_property(client->dev.of_node, "monitored-zones", NULL)) {
> + data->tz = thermal_zone_of_device_register(&client->dev,
> + 0,
> + false, /* -hwmon */
> + &client->dev,
> + lm75_read_temp);
> + if (IS_ERR(data->tz)) {
> + status = PTR_ERR(data->tz);
> + goto exit_hwmon;
> + }
> + }
> +
> dev_info(&client->dev, "%s: sensor '%s'\n",
> dev_name(data->hwmon_dev), client->name);
>
> return 0;
>
> +exit_hwmon:
> + hwmon_device_unregister(data->hwmon_dev);
> exit_remove:
> sysfs_remove_group(&client->dev.kobj, &lm75_group);
> return status;
> @@ -285,6 +315,7 @@ static int lm75_remove(struct i2c_client *client)
> {
> struct lm75_data *data = i2c_get_clientdata(client);
>
> + thermal_zone_device_unregister(data->tz);
> hwmon_device_unregister(data->hwmon_dev);
> sysfs_remove_group(&client->dev.kobj, &lm75_group);
> lm75_write_value(client, LM75_REG_CONF, data->orig_conf);
> --
> 1.8.2.1.342.gfa7285d
>
>

2013-08-27 10:27:13

by Mark Rutland

[permalink] [raw]
Subject: Re: [RFC PATCH 04/14] hwmon: tmp102: expose to thermal fw via DT nodes

On Sat, Aug 24, 2013 at 12:15:45AM +0100, Eduardo Valentin wrote:
> This patch adds to tmp102 temperature sensor the possibility
> to expose itself as thermal zone device, registered on the
> thermal framework.
>
> The thermal zone is built only if a device tree node
> describing a thermal zone for this sensor is present
> inside the tmp102 DT node. Otherwise, the driver behavior
> will be the same.

Similarly to the lm75, this requires a binding document update, but I
don't think the binding extension's necessary anyway.

Thanks,
Mark.

>
> Cc: Jean Delvare <[email protected]>
> Cc: Guenter Roeck <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Eduardo Valentin <[email protected]>
> ---
> drivers/hwmon/tmp102.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/drivers/hwmon/tmp102.c b/drivers/hwmon/tmp102.c
> index d7b47ab..0f1e44a 100644
> --- a/drivers/hwmon/tmp102.c
> +++ b/drivers/hwmon/tmp102.c
> @@ -27,6 +27,8 @@
> #include <linux/mutex.h>
> #include <linux/device.h>
> #include <linux/jiffies.h>
> +#include <linux/thermal.h>
> +#include <linux/of.h>
>
> #define DRIVER_NAME "tmp102"
>
> @@ -50,6 +52,7 @@
>
> struct tmp102 {
> struct device *hwmon_dev;
> + struct thermal_zone_device *tz;
> struct mutex lock;
> u16 config_orig;
> unsigned long last_update;
> @@ -93,6 +96,15 @@ static struct tmp102 *tmp102_update_device(struct i2c_client *client)
> return tmp102;
> }
>
> +static int tmp102_read_temp(void *dev, unsigned long *temp)
> +{
> + struct tmp102 *tmp102 = tmp102_update_device(to_i2c_client(dev));
> +
> + *temp = tmp102->temp[0];
> +
> + return 0;
> +}
> +
> static ssize_t tmp102_show_temp(struct device *dev,
> struct device_attribute *attr,
> char *buf)
> @@ -204,10 +216,23 @@ static int tmp102_probe(struct i2c_client *client,
> goto fail_remove_sysfs;
> }
>
> + if (of_find_property(client->dev.of_node, "monitored-zones", NULL)) {
> + tmp102->tz = thermal_zone_of_device_register(&client->dev, 0,
> + false, /* -hwmon */
> + &client->dev,
> + tmp102_read_temp);
> + if (IS_ERR(tmp102->tz)) {
> + status = PTR_ERR(tmp102->tz);
> + goto exit_hwmon;
> + }
> + }
> +
> dev_info(&client->dev, "initialized\n");
>
> return 0;
>
> +exit_hwmon:
> + hwmon_device_unregister(tmp102->hwmon_dev);
> fail_remove_sysfs:
> sysfs_remove_group(&client->dev.kobj, &tmp102_attr_group);
> fail_restore_config:
> @@ -220,6 +245,7 @@ static int tmp102_remove(struct i2c_client *client)
> {
> struct tmp102 *tmp102 = i2c_get_clientdata(client);
>
> + thermal_zone_device_unregister(tmp102->tz);
> hwmon_device_unregister(tmp102->hwmon_dev);
> sysfs_remove_group(&client->dev.kobj, &tmp102_attr_group);
>
> --
> 1.8.2.1.342.gfa7285d
>
>

2013-08-27 13:07:16

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 01/14] cpufreq: cpufreq-cpu0: add dt node parsing for 'cooling-zones'

Hey Mark,

On 27-08-2013 05:29, Mark Rutland wrote:
> On Sat, Aug 24, 2013 at 12:15:42AM +0100, Eduardo Valentin wrote:
>> This patch changes the cpufreq-cpu0 driver to consider if
>> a cpu needs cooling (with cpufreq). In case the cooling is needed,
>> it can be flagged at the cpu0 device tree node, with the list
>> of zones property 'cooling-zones'.
>>
>> In case this list of zones is present, the driver will
>> load a cpufreq cooling device in the system. The cpufreq-cpu0
>> driver is not interested in determining how the system should
>> be using the cooling device. The driver is responsible
>> only of loading the cooling device.
>>
>> Describing how the cooling device will be used can be
>> accomplished by setting up a thermal zone that references
>> and is composed by the cpufreq cooling device.
>>
>> Cc: "Rafael J. Wysocki" <[email protected]>
>> Cc: Viresh Kumar <[email protected]>
>> Cc: Grant Likely <[email protected]>
>> Cc: Rob Herring <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Eduardo Valentin <[email protected]>
>> ---
>> Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt | 4 ++++
>> drivers/cpufreq/cpufreq-cpu0.c | 12 ++++++++++++
>> 2 files changed, 16 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
>> index 051f764..add50f7 100644
>> --- a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
>> +++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
>> @@ -15,6 +15,9 @@ Optional properties:
>> - clock-latency: Specify the possible maximum transition latency for clock,
>> in unit of nanoseconds.
>> - voltage-tolerance: Specify the CPU voltage tolerance in percentage.
>> +- cooling-zones: A list of thermal zones phandles. The generic cpu
>> + cooling (freq clipping) is loaded by the generic cpufreq-cpu0 driver
>> + in case the device tree node has this list.
>
> Bindings should not describe the behaviour of any kernel (and for that
> reason I'm not very keen on the current cpufreq-cpu0 binding document).
>

Right.

> Bindings should simply describe the hardware. This addition could
> instead be:
>
> - cooling-zones: a list of thermal zone phandles.
>

OK. This works for me.

> However, as the thermal zone binding doesn't seem to have appeared by
> this patch, it should get moved later anyway...

Yeah, I will reorder them.

>
> Thanks,
> Mark.
>
>>
>> Examples:
>>
>> @@ -33,6 +36,7 @@ cpus {
>> 198000 850000
>> >;
>> clock-latency = <61036>; /* two CLK32 periods */
>> + cooling-zones = <&cpu_thermal>;
>> };
>>
>> cpu@1 {
>> diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
>> index ad1fde2..ede6487 100644
>> --- a/drivers/cpufreq/cpufreq-cpu0.c
>> +++ b/drivers/cpufreq/cpufreq-cpu0.c
>> @@ -20,6 +20,9 @@
>> #include <linux/platform_device.h>
>> #include <linux/regulator/consumer.h>
>> #include <linux/slab.h>
>> +#include <linux/thermal.h>
>> +#include <linux/cpu_cooling.h>
>> +#include <linux/cpumask.h>
>>
>> static unsigned int transition_latency;
>> static unsigned int voltage_tolerance; /* in percentage */
>> @@ -28,6 +31,7 @@ static struct device *cpu_dev;
>> static struct clk *cpu_clk;
>> static struct regulator *cpu_reg;
>> static struct cpufreq_frequency_table *freq_table;
>> +static struct thermal_cooling_device *cdev;
>>
>> static int cpu0_verify_speed(struct cpufreq_policy *policy)
>> {
>> @@ -268,6 +272,13 @@ static int cpu0_cpufreq_probe(struct platform_device *pdev)
>> goto out_free_table;
>> }
>>
>> + /*
>> + * For now, just loading the cooling device;
>> + * thermal DT code takes care of matching them.
>> + */
>> + if (of_find_property(np, "cooling-zones", NULL))
>> + cdev = cpufreq_cooling_register(cpu_present_mask);
>> +
>> of_node_put(np);
>> of_node_put(parent);
>> return 0;
>> @@ -283,6 +294,7 @@ out_put_parent:
>>
>> static int cpu0_cpufreq_remove(struct platform_device *pdev)
>> {
>> + cpufreq_cooling_unregister(cdev);
>> cpufreq_unregister_driver(&cpu0_cpufreq_driver);
>> opp_free_cpufreq_table(cpu_dev, &freq_table);
>>
>> --
>> 1.8.2.1.342.gfa7285d
>>
>>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-08-27 13:46:00

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

Hello Mark,


First of all, thanks for taking the time to review in such level of
detail. Lets try to align and have a common understanding. Answers go
inline. Please let me know if I missed something or if I made it more
confusing :-)

On 27-08-2013 06:22, Mark Rutland wrote:
> Hi,
>
> On Sat, Aug 24, 2013 at 12:15:43AM +0100, Eduardo Valentin wrote:
>> In order to be able to build thermal policies
>> based on generic sensors, like I2C device, that
>> can be places in different points on different boards,
>> there is a need to have a way to feed board dependent
>> data into the thermal framework.
>>
>> This patch introduces a thermal data parser for device
>> tree. The parsed data is used to build thermal zones
>> and thermal binding parameters. The output data
>> can then be used to deploy thermal policies.
>>
>> This patch adds also documentation regarding this
>> API and how to define tree nodes to use
>> this infrastructure.
>>
>> Cc: Zhang Rui <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Eduardo Valentin <[email protected]>
>> ---
>> .../devicetree/bindings/thermal/thermal.txt | 160 +++++++
>> drivers/thermal/Kconfig | 13 +
>> drivers/thermal/Makefile | 1 +
>> drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++
>> include/dt-bindings/thermal/thermal.h | 27 ++
>> include/linux/thermal.h | 3 +
>> 6 files changed, 677 insertions(+)
>> create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt
>> create mode 100644 drivers/thermal/thermal_dt.c
>> create mode 100644 include/dt-bindings/thermal/thermal.h
>>


General question on device tree documentation. Assuming there is such an
effort to make it Linux independent, is there other places out of Linux
tree that these binding must be documented?


>> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
>> new file mode 100644
>> index 0000000..af20ab0
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt
>> @@ -0,0 +1,160 @@
>> +----------------------------------------
>> +Thermal Framework Device Tree descriptor
>> +----------------------------------------
>> +
>> +This file describes how to define a thermal structure using device tree.
>> +A thermal structure includes thermal zones and their components, such
>> +as name, trip points, polling intervals and cooling devices binding
>> +descriptors. A binding descriptor may contain information on how to
>> +react, with a cooling stepped action or a weight on a fair share.
>> +The target of device tree thermal descriptors is to describe only
>> +the hardware thermal aspects, not how the system must control or which
>> +algorithm or policy must take place. It follows a description of each
>> +type of device tree node.
>> +
>> +****
>> +trip
>> +****
>> +
>> +The trip node is a node to describe a point in the temperature domain
>> +in which the system takes an action. This node describes just the point,
>> +not the action.
>> +
>> +A node describing a trip must contain:
>> +- temperature: the trip temperature level, in milliCelsius.
>> +- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative
>> +value, in milliCelsius.
>
> Presumably this is a single (u32) cell?

Yes, both of them above are a single u32.

>
>> +- type: the trip type. Here is the type mapping:
>> + THERMAL_TRIP_ACTIVE
>> + THERMAL_TRIP_PASSIVE
>> + THERMAL_TRIP_HOT
>> + THERMAL_TRIP_CRITICAL
>
> What type is this property?
>
> What are these values? Do you need to refer to a header somewhere?

There is a header, introduced in this patch.
include/dt-bindings/thermal/thermal.h

> Symbolic constants should have an associated value for an ABI.

OK. Sounds reasonable to me. Shall we be descriptive and enlist the
values in this doc too or just pointing to header is fine?

>
>> +
>> +**********
>> +bind_param
>> +**********
>> +
>> +The bind_param node is a node to describe how cooling devices get assigned
>> +to trip points of the zone. The cooling devices are expected to be loaded
>> +in the target system.
>> +
>> +A node describing a bind_param must contain:
>> +- cooling_device: A string with the cooling device name.
>
> Please use '-' rather than '_' in bindings. That's the normal devicetree
> convention, and there's no point gonig against it and causing more
> confusion.

Agreed.

>
> What are valid values of this string, and what do they mean?

These are the name of the cooling devices to match to.

>
> Why not a phandle + cells binding to cooling devices?
>

Yeah, this could work too. I am going to check how that would look like
and come back to you.

>> +- weight: The 'influence' of a particular cooling device on this zone.
>> + This is on a percentage scale. The sum of all these weights
>> + (for a particular zone) cannot exceed 100.
>
> This is a bit fuzzy, and certainly needs more guidance.

OK. This property is used to describe the amount of cooling contribution
when activating a cooling device at a certain trip point. I will work on
a better guidance.

>
>> +- trip_mask: This is a bit mask that gives the binding relation between
>> + this thermal zone and cdev, for a particular trip point.
>> + If nth bit is set, then the cdev and thermal zone are bound
>> + for trip point n.
>
> What type is this?
>
> The nth bit from which end?
>
> Could you explain what "bound for trip point n" means?
>

It is just a way to match a cooling device with a set of trip points.
Say you want to match one specific cooling device with trips 0 and 1,
you would then create a bind_param with trip_mask = 0x3.

> Just because this is a bitmask in Linux does not mean it needs to be so
> in the dt.

Indeed.

In case we work with phandle + cell binging parameters, the trip and
weight could be passed as params.

cooling-devices = <&devX tripY weightZ>, <... >;



>
>> +- limits: An array integers consisting of 2-tuples items, and each item means
>> + lower and upper state limits, like <min-state max-state>.
>> +
>> +**************************
>> +temperature sensor devices
>> +**************************
>> +
>> +Temperature sensor devices have to be linked to a specific thermal zone.
>> +This is done by means of the 'monitored-zones' list.
>> +- monitored-zones: A list of thermal zones phandles, identifying which
>> +thermal zones the temperature sensor device is used to monitor.
>
> Why does this need to be described that way around?
>
> Can the zone not have a link to the sensor(s) it needs?
>

Do you see any issues with it?

Idea is that sensor nodes would list which zones they are used. Same
idea goes to cooling devices. It can be the other way around, yes. It
depends on what is the best way to describe these relations. Idea would
be so that sensors and cooling devices would be composing a thermal zone.

>> +
>> +************
>> +thermal_zone
>> +************
>> +
>> +The thermal_zone node is the node containing all the required info
>> +for describing a thermal zone, including its cdev bindings. The thermal_zone
>> +node must contain, apart from its own properties, one node containing
>> +trip nodes and one node containing all the zone bind parameters.
>> +
>> +Required properties:
>> +- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable.
>
> What type is a bit string in devicetree? I'm aware of cells and (byte)
> strings, but not bit strings.
>
> What does 'writeable' mean in this context?
>
> There's presumably a better name than 'mask'.
>

This is used to describe if trip points are writeable under sysfs. I
think this is a OS implementation leak. I will be removing this from
device tree and making them RO by default.

>> +
>> +- passive_delay: number of milliseconds to wait between polls when
>> + performing passive cooling.
>> +
>> +- polling_delay: number of milliseconds to wait between polls when checking.
>
> These sound very much like configuration of the driver rather than
> hardware description. What if my temperature sensor can generate an
> interrupt at some trigger temperature? Why would I need polling times?
>
> Why does this need to be in the dt at all? Surely the OS can choose some
> sensible polling period, possibly dynamically?

Well, let me answer this one with another question. How the OS will
determine how fast is the temperature rise on a thermal zone described
in device tree?

>
>> +
>> +- #thermal-cells: An integer that is used to specify how many cells compose
>> + sensor ids.
>
> I don't see why this is necessary. If the zone itself described its
> related sensors rather than the other way around, we wouldn't need it at
> all. Most bindings so far have consumers link to their providers rather
> than the other way around.
>

I see.


>> +
>> +Below is an example:
>> +cpu_thermal: thermal_zone {
>> + #thermal-cells = <1>;
>> + mask = <0x03>; /* trips writability */
>> + passive_delay = <250>; /* milliseconds */
>> + polling_delay = <1000>; /* milliseconds */
>> + trips {
>> + alert@100000{
>> + temperature = <100000>; /* milliCelsius */
>> + hysteresis = <0>; /* milliCelsius */
>> + type = <THERMAL_TRIP_PASSIVE>;
>> + };
>> + crit@125000{
>> + temperature = <125000>; /* milliCelsius */
>> + hysteresis = <0>; /* milliCelsius */
>> + type = <THERMAL_TRIP_CRITICAL>;
>> + };
>> + };
>> + bind_params {
>> + action@0{
>> + cooling_device = "thermal-cpufreq";
>
> NAK. That value was not defined anywhere, and represents Linux
> infrastructure rather than hardware.
>

Yes, this was another leak as discussed above in the upper part of the
doc file.

> For this case, we could just as easily describe the thermal points, and
> if no physical cooling device (e.g. a fan) is present, assume
> cpufreq-base cooling. Other OSs could choose to do otherwise, and we
> could change later...
>

The way other specifications do (aka ACPI) is to have trip types. In
case a trip type is passive cooling, it means one is supposed to
modulate the dissipated power rather than activating a device to remove
heat (active cooling).

The part of 'assuming' things that bothers me. I would rather have a way
to really say one wants to have passive cooling at a specific trip point.


>> + weight = <100>; /* percentage */
>> + mask = <0x01>;
>
> This should presumably be trip-mask (assuming my suggested corrections).
>
> What is it supposed to mean here?

As stated above, it means the trip points that this binding represents.

>
>> + limits = <
>> + /* lower-state upper-state */
>> + 1 THERMAL_NO_LIMITS
>> + THERMAL_NO_LIMITS THERMAL_NO_LIMITS
>> + >;
>> + };
>> + };
>> +};
>> +
>> +sensor: adc@0xFFFF {
>> + ...
>> + monitored-zones = <&cpu_thermal 0>;
>> +};
>> +
>> +cpu@0 {
>> + ...
>> + cooling-zones = <&cpu_thermal>;
>> +};
>> +
>> +In the example above, the ADC sensor at address 0xFFFF is used to monitor
>> +the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked
>> +to the same thermal zone 'cpu_thermal' as a cooling device.
>
> Huh? That seems to imply the the '0' in '<&cpu_thermal 0>' on the sensor
> node describes a property of the sensor rather than the thermal node,
> unless I've misunderstood?
>

It is essentially which sensor id is used to monitor a specific zone. In
case of sensor IPs that feature several internal sensors, one needs to
have a way to describe which sensor monitors which zone. Example of this
type of IPs: lm90 (up to 3 sensors) and TI bandgap IP (can contain up to
5 internal sensors). The probe happens for the IP device and not for the
internal sensors.


>> +
>> +***************
>> +cpufreq-cooling
>> +***************
>> +
>> +The generic cpu cooling (freq clipping) can be loaded by the
>> +generic cpufreq-cpu0 driver in case the device tree node informs
>> +this need.
>> +
>> +In order to load the cpufreq cooling device, it is possible to
>> +inform that the cpu needs cooling by adding the list of thermal zones
>> +in the 'cooling-zones' property at the cpu0 phandle.
>
> This should be reworded so as to describe the binding rather than the
> Linux behaviour based on the binding.

OK.

>
>> +
>> +Example:
>> + cpus {
>> + cpu@0 {
>> + operating-points = <
>> + /* kHz uV */
>> + 800000 1313000
>> + 1008000 1375000
>> + >;
>> + cooling-zones = <&cpu_thermal>;
>> + };
>> + ...
>> + };
>> +
>> +The above will cause the cpufreq-cpu0 driver to understand that
>> +the cpu will need passive cooling and while probing that node it will
>> +also load the cpufreq cpu cooling device in that system.
>
> The above describe that the CPU is part of a cooling (thermal?) zone and
> requires cooling.

That says the cpu participates in a thermal zone as a cooling
device/mechanism, thus cpu cooling (passive cooling) is required in
target system.

>
>> +
>> +However, be advised that this flag will not describe what needs to
>> +be done or how the cpufreq cooling device will be used. In order to
>> +accomplish this, one needs to write a phandle for a thermal zone, as
>> +described in the section 'thermal_zone'.
>
> The above already has a phandle to a thermal (cooling?) zone.
>
> Please choose either cooling zone or thermal zone, having the two names
> makes this more confusing than necessary.

I see. Idea is to have thermal zone as a node. cooling-zone is just a
list of thermal zones that the referred device will be used as cooling
device.

>
>> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
>> index 7fb16bc..753f0dc 100644
>> --- a/drivers/thermal/Kconfig
>> +++ b/drivers/thermal/Kconfig
>> @@ -29,6 +29,19 @@ config THERMAL_HWMON
>> Say 'Y' here if you want all thermal sensors to
>> have hwmon sysfs interface too.
>>
>> +config THERMAL_OF
>> + bool
>> + prompt "APIs to parse thermal data out of device tree"
>> + depends on OF
>> + default y
>> + help
>> + This options provides helpers to add the support to
>> + read and parse thermal data definitions out of the
>> + device tree blob.
>> +
>> + Say 'Y' here if you need to build thermal infrastructure
>> + based on device tree.
>> +
>> choice
>> prompt "Default Thermal governor"
>> default THERMAL_DEFAULT_GOV_STEP_WISE
>> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
>> index 24cb894..eedb273 100644
>> --- a/drivers/thermal/Makefile
>> +++ b/drivers/thermal/Makefile
>> @@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o
>>
>> # interface to/from other layers providing sensors
>> thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o
>> +thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o
>>
>> # governors
>> thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o
>> diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c
>> new file mode 100644
>> index 0000000..cc35485
>> --- /dev/null
>> +++ b/drivers/thermal/thermal_dt.c
>> @@ -0,0 +1,473 @@
>> +/*
>> + * thermal_dt.c - Generic Thermal Management device tree support.
>> + *
>> + * Copyright (C) 2013 Texas Instruments
>> + * Copyright (C) 2013 Eduardo Valentin <[email protected]>
>> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; version 2 of the License.
>> + *
>> + * This program is distributed in the hope that it will be useful, but
>> + * WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + * General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, write to the Free Software Foundation, Inc.,
>> + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
>> + *
>> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> + */
>> +#include <linux/thermal.h>
>> +#include <linux/slab.h>
>> +#include <linux/types.h>
>> +#include <linux/of_device.h>
>> +#include <linux/of_platform.h>
>> +#include <linux/err.h>
>> +#include <linux/export.h>
>> +#include <linux/string.h>
>> +
>> +struct __thermal_bind_params {
>> + char cooling_device[THERMAL_NAME_LENGTH];
>> +};
>> +
>> +static
>> +int thermal_of_match(struct thermal_zone_device *tz,
>> + struct thermal_cooling_device *cdev);
>> +
>> +static int thermal_of_populate_bind_params(struct device *dev,
>> + struct device_node *node,
>> + struct __thermal_bind_params *__tbp,
>> + struct thermal_bind_params *tbp,
>> + int ntrips)
>> +{
>> + const char *cdev_name;
>> + u32 *limits;
>> + int ret;
>> + u32 prop;
>> +
>> + ret = of_property_read_u32(node, "weight", &prop);
>> + if (ret < 0) {
>> + dev_err(dev, "missing weight property\n");
>> + return ret;
>> + }
>> + tbp->weight = prop;
>> +
>> + ret = of_property_read_u32(node, "mask", &prop);
>> + if (ret < 0) {
>> + dev_err(dev, "missing mask property\n");
>> + return ret;
>> + }
>> + tbp->trip_mask = prop;
>> +
>> + /* this will allow us to bind with cooling devices */
>> + tbp->match = thermal_of_match;
>> +
>> + ret = of_property_read_string(node, "cooling_device", &cdev_name);
>> + if (ret < 0) {
>> + dev_err(dev, "missing cooling_device property\n");
>> + return ret;
>> + }
>> + strncpy(__tbp->cooling_device, cdev_name,
>> + sizeof(__tbp->cooling_device));
>> +
>> + limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL);
>
> Why do you use kzalloc here, but devm_kzalloc elsewhere? That'll lead to
> problems as the driver gets refactored over time.

Because this memory is used only within this function and thus released
at the bottom of it.

>
>> + if (!limits) {
>> + dev_err(dev, "not enough mem for reading limits\n");
>> + return -ENOMEM;
>> + }
>> + ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips);
>> + if (!ret) {
>> + int i = ntrips;
>> +
>> + tbp->binding_limits = devm_kzalloc(dev,
>> + ntrips * 2 *
>> + sizeof(*tbp->binding_limits),
>> + GFP_KERNEL);
>> + if (!tbp->binding_limits) {
>> + dev_err(dev, "not enough mem for storing limits\n");
>> + return -ENOMEM;
>> + }
>> + while (i--)
>> + tbp->binding_limits[i] = limits[i];
>> + }
>> + kfree(limits);

here.

>> +
>> + return 0;
>
> What if you failed to read the array? Surely you should return an error?

Attempting to make this property not mandatory.

>
>> +}
>> +
>> +struct __thermal_trip {
>> + unsigned long int temperature;
>> + unsigned long int hysteresis;
>> + enum thermal_trip_type type;
>> +};
>> +
>> +static
>> +int thermal_of_populate_trip(struct device *dev,
>> + struct device_node *node,
>> + struct __thermal_trip *trip)
>> +{
>> + int ret;
>> + int prop;
>> +
>> + ret = of_property_read_u32(node, "temperature", &prop);
>> + if (ret < 0) {
>> + dev_err(dev, "missing temperature property\n");
>> + return ret;
>> + }
>> + trip->temperature = prop;
>> +
>> + ret = of_property_read_u32(node, "hysteresis", &prop);
>> + if (ret < 0) {
>> + dev_err(dev, "missing hysteresis property\n");
>> + return ret;
>> + }
>> + trip->hysteresis = prop;
>> +
>> + ret = of_property_read_u32(node, "type", &prop);
>> + if (ret < 0) {
>> + dev_err(dev, "missing type property\n");
>> + return ret;
>> + }
>> + trip->type = prop;
>
> This will require sanity checking.

OK.

>
>> +
>> + return 0;
>> +}
>> +
>> +struct __thermal_zone_device {
>> + enum thermal_device_mode mode;
>> + int passive_delay;
>> + int polling_delay;
>> + int mask;
>> + int ntrips;
>> + char type[THERMAL_NAME_LENGTH];
>> + struct __thermal_trip *trips;
>> + struct __thermal_bind_params *bind_params;
>> + struct thermal_bind_params *tbps;
>> + struct thermal_zone_params zone_params;
>> + int (*get_temp)(void *, unsigned long *);
>> + void *devdata;
>> +};
>> +
>> +static struct __thermal_zone_device *
>> +thermal_of_build_thermal_zone(struct device *dev, struct device_node *node)
>> +{
>> + struct device_node *child, *gchild;
>> + struct __thermal_zone_device *tz;
>> + int ret, i;
>> + u32 prop;
>> +
>> + if (!node) {
>> + dev_err(dev, "no thermal zone node\n");
>> + return ERR_PTR(-EINVAL);
>> + }
>> +
>> + tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL);
>> + if (!tz) {
>> + dev_err(dev, "not enough memory for thermal of zone\n");
>> + return ERR_PTR(-ENOMEM);
>> + }
>> +
>> + ret = of_property_read_u32(node, "passive_delay", &prop);
>> + if (ret < 0) {
>> + dev_err(dev, "missing passive_delay property\n");
>> + return ERR_PTR(ret);
>> + }
>> + tz->passive_delay = prop;
>> +
>> + ret = of_property_read_u32(node, "polling_delay", &prop);
>> + if (ret < 0) {
>> + dev_err(dev, "missing polling_delay property\n");
>> + return ERR_PTR(ret);
>> + }
>> + tz->polling_delay = prop;
>> +
>> + ret = of_property_read_u32(node, "mask", &prop);
>> + if (ret < 0) {
>> + dev_err(dev, "missing mask property\n");
>> + return ERR_PTR(ret);
>> + }
>> + tz->mask = prop;
>> +
>> + /* assume node name as our zone type */
>> + strncpy(tz->type, node->name, sizeof(tz->type));
>> +
>> + /* default policy */
>> + strncpy(tz->zone_params.governor_name, "step_wise",
>> + sizeof(tz->zone_params.governor_name));
>> +
>> + /* trips */
>> + child = of_find_node_by_name(node, "trips");
>
> That might not be a child node, as of_find_node_by_name walks over the
> list of all nodes. Use of_get_child_by_name.

Right! This is a bug.

>
>> +
>> + /* No trips provided */
>> + if (!child)
>> + goto finish;
>> +
>> + tz->ntrips = of_get_child_count(child);
>> + tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips),
>> + GFP_KERNEL);
>> + if (!tz->trips)
>> + return ERR_PTR(-ENOMEM);
>> + i = 0;
>> + for_each_child_of_node(child, gchild)
>> + thermal_of_populate_trip(dev, gchild, &tz->trips[i++]);
>
> What if a child node was not a valid trip node? It seems
> thermal_of_populate_trip returns errors you're ignoring.

OK. Adding proper treatment.

>
>> +
>> + /* bind_params */
>> + child = of_find_node_by_name(node, "bind_params");
>
> Another of_get_child_by_name candidate.

OK.

>
>> +
>> + /* No binding info */
>> + if (!child)
>> + goto finish;
>> +
>> + tz->zone_params.num_tbps = of_get_child_count(child);
>> + tz->bind_params = devm_kzalloc(dev,
>> + tz->zone_params.num_tbps *
>> + sizeof(*tz->bind_params),
>> + GFP_KERNEL);
>> + if (!tz->bind_params)
>> + return ERR_PTR(-ENOMEM);
>> + tz->zone_params.tbp = devm_kzalloc(dev,
>> + tz->zone_params.num_tbps *
>> + sizeof(*tz->zone_params.tbp),
>> + GFP_KERNEL);
>> + if (!tz->zone_params.tbp)
>> + return ERR_PTR(-ENOMEM);
>> + i = 0;
>> + for_each_child_of_node(child, gchild) {
>> + thermal_of_populate_bind_params(dev, gchild,
>> + &tz->bind_params[i],
>> + &tz->zone_params.tbp[i],
>> + tz->ntrips);
>
> Similarly, you're ignoring error conditions here. What if a bind_params
> node was malformed?


Adding proper error handling.

>
>> + i++;
>> + }
>> +
>> +finish:
>> + tz->mode = THERMAL_DEVICE_ENABLED;
>> +
>> + return tz;
>> +}
>> +
>> +static
>> +int thermal_of_match(struct thermal_zone_device *tz,
>> + struct thermal_cooling_device *cdev)
>> +{
>> + struct __thermal_zone_device *data = tz->devdata;
>> + int i;
>> +
>> + for (i = 0; i < data->zone_params.num_tbps; i++) {
>> + if (!strncmp(data->bind_params[i].cooling_device,
>> + cdev->type,
>> + strlen(data->bind_params[i].cooling_device)) &&
>> + (data->zone_params.tbp[i].trip_mask & (1 << i)))
>> + return 0;
>> + }
>> +
>> + return -EINVAL;
>> +}
>
> I'm not keen on binding to cooling devices using a string, as suddenly a
> name to allow humans to inspect the system is turned into an ABI.

OK. We need to align on how we describe the binding then.

>
>> +
>> +static
>> +int of_thermal_get_temp(struct thermal_zone_device *tz,
>> + unsigned long *temp)
>> +{
>> + struct __thermal_zone_device *data = tz->devdata;
>> +
>> + return data->get_temp(data->devdata, temp);
>> +}
>> +
>> +static
>> +int of_thermal_get_mode(struct thermal_zone_device *tz,
>> + enum thermal_device_mode *mode)
>> +{
>> + struct __thermal_zone_device *data = tz->devdata;
>> +
>> + *mode = data->mode;
>> +
>> + return 0;
>> +}
>> +
>> +static
>> +int of_thermal_set_mode(struct thermal_zone_device *tz,
>> + enum thermal_device_mode mode)
>> +{
>> + struct __thermal_zone_device *data = tz->devdata;
>> +
>> + mutex_lock(&tz->lock);
>> +
>> + if (mode == THERMAL_DEVICE_ENABLED)
>> + tz->polling_delay = data->polling_delay;
>> + else
>> + tz->polling_delay = 0;
>> +
>> + mutex_unlock(&tz->lock);
>> +
>> + data->mode = mode;
>> + thermal_zone_device_update(tz);
>> +
>> + return 0;
>> +}
>> +
>> +static
>> +int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip,
>> + enum thermal_trip_type *type)
>> +{
>> + struct __thermal_zone_device *data = tz->devdata;
>> +
>> + if (trip >= data->ntrips || trip < 0)
>> + return -EDOM;
>
> It's far more common to use -EINVAL.

hmm.. OK.

>
>> +
>> + *type = data->trips[trip].type;
>> +
>> + return 0;
>> +}
>> +
>> +static
>> +int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip,
>> + unsigned long *temp)
>> +{
>> + struct __thermal_zone_device *data = tz->devdata;
>> +
>> + if (trip >= data->ntrips || trip < 0)
>> + return -EDOM;
>> +
>> + *temp = data->trips[trip].temperature;
>> +
>> + return 0;
>> +}
>> +
>> +static
>> +int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip,
>> + unsigned long temp)
>> +{
>> + struct __thermal_zone_device *data = tz->devdata;
>> +
>> + if (trip >= data->ntrips || trip < 0)
>> + return -EDOM;
>> +
>> + /* thermal fw should take care of data->mask & (1 << trip) */
>> + data->trips[trip].temperature = temp;
>> +
>> + return 0;
>> +}
>> +
>> +static
>> +int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip,
>> + unsigned long *hyst)
>> +{
>> + struct __thermal_zone_device *data = tz->devdata;
>> +
>> + if (trip >= data->ntrips || trip < 0)
>> + return -EDOM;
>> +
>> + *hyst = data->trips[trip].hysteresis;
>> +
>> + return 0;
>> +}
>> +
>> +static
>> +int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip,
>> + unsigned long hyst)
>> +{
>> + struct __thermal_zone_device *data = tz->devdata;
>> +
>> + if (trip >= data->ntrips || trip < 0)
>> + return -EDOM;
>> +
>> + /* thermal fw should take care of data->mask & (1 << trip) */
>
> Could you elaborate on that comment?

In case the trip is not assigned to be writable, this function won't be
called.


>
>> + data->trips[trip].hysteresis = hyst;
>> +
>> + return 0;
>> +}
>
> [...]
>
> Thanks,
> Mark.
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-08-27 16:23:16

by Mark Rutland

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote:
> Hello Mark,
>
>
> First of all, thanks for taking the time to review in such level of
> detail. Lets try to align and have a common understanding. Answers go
> inline. Please let me know if I missed something or if I made it more
> confusing :-)
>
> On 27-08-2013 06:22, Mark Rutland wrote:
> > Hi,
> >
> > On Sat, Aug 24, 2013 at 12:15:43AM +0100, Eduardo Valentin wrote:
> >> In order to be able to build thermal policies
> >> based on generic sensors, like I2C device, that
> >> can be places in different points on different boards,
> >> there is a need to have a way to feed board dependent
> >> data into the thermal framework.
> >>
> >> This patch introduces a thermal data parser for device
> >> tree. The parsed data is used to build thermal zones
> >> and thermal binding parameters. The output data
> >> can then be used to deploy thermal policies.
> >>
> >> This patch adds also documentation regarding this
> >> API and how to define tree nodes to use
> >> this infrastructure.
> >>
> >> Cc: Zhang Rui <[email protected]>
> >> Cc: [email protected]
> >> Cc: [email protected]
> >> Signed-off-by: Eduardo Valentin <[email protected]>
> >> ---
> >> .../devicetree/bindings/thermal/thermal.txt | 160 +++++++
> >> drivers/thermal/Kconfig | 13 +
> >> drivers/thermal/Makefile | 1 +
> >> drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++
> >> include/dt-bindings/thermal/thermal.h | 27 ++
> >> include/linux/thermal.h | 3 +
> >> 6 files changed, 677 insertions(+)
> >> create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt
> >> create mode 100644 drivers/thermal/thermal_dt.c
> >> create mode 100644 include/dt-bindings/thermal/thermal.h
> >>
>
>
> General question on device tree documentation. Assuming there is such an
> effort to make it Linux independent, is there other places out of Linux
> tree that these binding must be documented?

At present, no. There is a plan to move the documentation out of the
kernel, but I'm not sure on how that's currently progressing. Some
general cleanup needs to occur before we can actually do that.

>
>
> >> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
> >> new file mode 100644
> >> index 0000000..af20ab0
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt
> >> @@ -0,0 +1,160 @@
> >> +----------------------------------------
> >> +Thermal Framework Device Tree descriptor
> >> +----------------------------------------
> >> +
> >> +This file describes how to define a thermal structure using device tree.
> >> +A thermal structure includes thermal zones and their components, such
> >> +as name, trip points, polling intervals and cooling devices binding
> >> +descriptors. A binding descriptor may contain information on how to
> >> +react, with a cooling stepped action or a weight on a fair share.
> >> +The target of device tree thermal descriptors is to describe only
> >> +the hardware thermal aspects, not how the system must control or which
> >> +algorithm or policy must take place. It follows a description of each
> >> +type of device tree node.
> >> +
> >> +****
> >> +trip
> >> +****
> >> +
> >> +The trip node is a node to describe a point in the temperature domain
> >> +in which the system takes an action. This node describes just the point,
> >> +not the action.
> >> +
> >> +A node describing a trip must contain:
> >> +- temperature: the trip temperature level, in milliCelsius.
> >> +- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative
> >> +value, in milliCelsius.
> >
> > Presumably this is a single (u32) cell?
>
> Yes, both of them above are a single u32.

Ok. This seems to be the default assumption in many bindings, so I
shan't argue against it here, but in general we should be as precise as
possible as to the format of the devicetree.

>
> >
> >> +- type: the trip type. Here is the type mapping:
> >> + THERMAL_TRIP_ACTIVE
> >> + THERMAL_TRIP_PASSIVE
> >> + THERMAL_TRIP_HOT
> >> + THERMAL_TRIP_CRITICAL
> >
> > What type is this property?
> >
> > What are these values? Do you need to refer to a header somewhere?
>
> There is a header, introduced in this patch.
> include/dt-bindings/thermal/thermal.h
>
> > Symbolic constants should have an associated value for an ABI.
>
> OK. Sounds reasonable to me. Shall we be descriptive and enlist the
> values in this doc too or just pointing to header is fine?

I believe that referring to the header is fine, though personally I'd
prefer if we were explicit in binding docs. I'll leave Stephen to
comment on what the preferred way of handling header constants with
binding documents is.

>
> >
> >> +
> >> +**********
> >> +bind_param
> >> +**********
> >> +
> >> +The bind_param node is a node to describe how cooling devices get assigned
> >> +to trip points of the zone. The cooling devices are expected to be loaded
> >> +in the target system.
> >> +
> >> +A node describing a bind_param must contain:
> >> +- cooling_device: A string with the cooling device name.
> >
> > Please use '-' rather than '_' in bindings. That's the normal devicetree
> > convention, and there's no point gonig against it and causing more
> > confusion.
>
> Agreed.
>
> >
> > What are valid values of this string, and what do they mean?
>
> These are the name of the cooling devices to match to.

Which are what exactly? What constitutes a "cooling device"?

Is "black-body radiation" [1] a valid cooling device?

How about "copper heat sink"?

>
> >
> > Why not a phandle + cells binding to cooling devices?
> >
>
> Yeah, this could work too. I am going to check how that would look like
> and come back to you.

Ok.

>
> >> +- weight: The 'influence' of a particular cooling device on this zone.
> >> + This is on a percentage scale. The sum of all these weights
> >> + (for a particular zone) cannot exceed 100.
> >
> > This is a bit fuzzy, and certainly needs more guidance.
>
> OK. This property is used to describe the amount of cooling contribution
> when activating a cooling device at a certain trip point. I will work on
> a better guidance.
>
> >
> >> +- trip_mask: This is a bit mask that gives the binding relation between
> >> + this thermal zone and cdev, for a particular trip point.
> >> + If nth bit is set, then the cdev and thermal zone are bound
> >> + for trip point n.
> >
> > What type is this?
> >
> > The nth bit from which end?
> >
> > Could you explain what "bound for trip point n" means?
> >
>
> It is just a way to match a cooling device with a set of trip points.
> Say you want to match one specific cooling device with trips 0 and 1,
> you would then create a bind_param with trip_mask = 0x3.
>
> > Just because this is a bitmask in Linux does not mean it needs to be so
> > in the dt.
>
> Indeed.
>
> In case we work with phandle + cell binging parameters, the trip and
> weight could be passed as params.
>
> cooling-devices = <&devX tripY weightZ>, <... >;

Something like that could certainly work.

>
> >
> >> +- limits: An array integers consisting of 2-tuples items, and each item means
> >> + lower and upper state limits, like <min-state max-state>.
> >> +
> >> +**************************
> >> +temperature sensor devices
> >> +**************************
> >> +
> >> +Temperature sensor devices have to be linked to a specific thermal zone.
> >> +This is done by means of the 'monitored-zones' list.
> >> +- monitored-zones: A list of thermal zones phandles, identifying which
> >> +thermal zones the temperature sensor device is used to monitor.
> >
> > Why does this need to be described that way around?
> >
> > Can the zone not have a link to the sensor(s) it needs?
> >
>
> Do you see any issues with it?
>
> Idea is that sensor nodes would list which zones they are used. Same
> idea goes to cooling devices. It can be the other way around, yes. It
> depends on what is the best way to describe these relations. Idea would
> be so that sensors and cooling devices would be composing a thermal zone.

I think that having a zone refer to the sensors that monitor it would be
preferable. Otherwise we're encoding that way a devices output is to be
consumed on the binding for the device itself, which feels odd (it's
arguable as to whether it's a property of the device, however).

>
> >> +
> >> +************
> >> +thermal_zone
> >> +************
> >> +
> >> +The thermal_zone node is the node containing all the required info
> >> +for describing a thermal zone, including its cdev bindings. The thermal_zone
> >> +node must contain, apart from its own properties, one node containing
> >> +trip nodes and one node containing all the zone bind parameters.
> >> +
> >> +Required properties:
> >> +- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable.
> >
> > What type is a bit string in devicetree? I'm aware of cells and (byte)
> > strings, but not bit strings.
> >
> > What does 'writeable' mean in this context?
> >
> > There's presumably a better name than 'mask'.
> >
>
> This is used to describe if trip points are writeable under sysfs. I
> think this is a OS implementation leak. I will be removing this from
> device tree and making them RO by default.

Ok.

>
> >> +
> >> +- passive_delay: number of milliseconds to wait between polls when
> >> + performing passive cooling.
> >> +
> >> +- polling_delay: number of milliseconds to wait between polls when checking.
> >
> > These sound very much like configuration of the driver rather than
> > hardware description. What if my temperature sensor can generate an
> > interrupt at some trigger temperature? Why would I need polling times?
> >
> > Why does this need to be in the dt at all? Surely the OS can choose some
> > sensible polling period, possibly dynamically?
>
> Well, let me answer this one with another question. How the OS will
> determine how fast is the temperature rise on a thermal zone described
> in device tree?

How will the devicetree writer determine this?

It's possible for the thermal management code to dynamically adjust the
rate between some extreme polling intervals based on how big a rise it
detects between polls or possibly something more complex taking into
account some metric for load.

>
> >
> >> +
> >> +- #thermal-cells: An integer that is used to specify how many cells compose
> >> + sensor ids.
> >
> > I don't see why this is necessary. If the zone itself described its
> > related sensors rather than the other way around, we wouldn't need it at
> > all. Most bindings so far have consumers link to their providers rather
> > than the other way around.
> >
>
> I see.
>
>
> >> +
> >> +Below is an example:
> >> +cpu_thermal: thermal_zone {
> >> + #thermal-cells = <1>;
> >> + mask = <0x03>; /* trips writability */
> >> + passive_delay = <250>; /* milliseconds */
> >> + polling_delay = <1000>; /* milliseconds */
> >> + trips {
> >> + alert@100000{
> >> + temperature = <100000>; /* milliCelsius */
> >> + hysteresis = <0>; /* milliCelsius */
> >> + type = <THERMAL_TRIP_PASSIVE>;
> >> + };
> >> + crit@125000{
> >> + temperature = <125000>; /* milliCelsius */
> >> + hysteresis = <0>; /* milliCelsius */
> >> + type = <THERMAL_TRIP_CRITICAL>;
> >> + };

I didn't spot this on my first round, but the trips node will need
#address-cells and #size-cells, and the sub nodes unit-addresses should
match their reg properties.

> >> + };
> >> + bind_params {
> >> + action@0{
> >> + cooling_device = "thermal-cpufreq";
> >
> > NAK. That value was not defined anywhere, and represents Linux
> > infrastructure rather than hardware.
> >
>
> Yes, this was another leak as discussed above in the upper part of the
> doc file.
>
> > For this case, we could just as easily describe the thermal points, and
> > if no physical cooling device (e.g. a fan) is present, assume
> > cpufreq-base cooling. Other OSs could choose to do otherwise, and we
> > could change later...
> >
>
> The way other specifications do (aka ACPI) is to have trip types. In
> case a trip type is passive cooling, it means one is supposed to
> modulate the dissipated power rather than activating a device to remove
> heat (active cooling).
>
> The part of 'assuming' things that bothers me. I would rather have a way
> to really say one wants to have passive cooling at a specific trip point.

I don't see the problem here. If you don't specify a device, your only
options will be some sort of passive management.

>
>
> >> + weight = <100>; /* percentage */
> >> + mask = <0x01>;
> >
> > This should presumably be trip-mask (assuming my suggested corrections).
> >
> > What is it supposed to mean here?
>
> As stated above, it means the trip points that this binding represents.

If you have more than 32 trip points (which is admittedly unlikely), how
do you represent them? I'm not sure a mask is the best representation
here, as it's difficult for a human to match up with a list of entries.

Additionally, the trip points don't have a logical contiguous numbering
from zero, so how do the bits in the mask correspond to entries in the
list? This seems to rely on the order that the subnodes are in, which
AFAIK is not guaranteed. It would be far nicer to have an explicit
linkage.

>
> >
> >> + limits = <
> >> + /* lower-state upper-state */
> >> + 1 THERMAL_NO_LIMITS
> >> + THERMAL_NO_LIMITS THERMAL_NO_LIMITS
> >> + >;
> >> + };
> >> + };
> >> +};
> >> +
> >> +sensor: adc@0xFFFF {
> >> + ...
> >> + monitored-zones = <&cpu_thermal 0>;
> >> +};
> >> +
> >> +cpu@0 {
> >> + ...
> >> + cooling-zones = <&cpu_thermal>;
> >> +};
> >> +
> >> +In the example above, the ADC sensor at address 0xFFFF is used to monitor
> >> +the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked
> >> +to the same thermal zone 'cpu_thermal' as a cooling device.
> >
> > Huh? That seems to imply the the '0' in '<&cpu_thermal 0>' on the sensor
> > node describes a property of the sensor rather than the thermal node,
> > unless I've misunderstood?
> >
>
> It is essentially which sensor id is used to monitor a specific zone. In
> case of sensor IPs that feature several internal sensors, one needs to
> have a way to describe which sensor monitors which zone. Example of this
> type of IPs: lm90 (up to 3 sensors) and TI bandgap IP (can contain up to
> 5 internal sensors). The probe happens for the IP device and not for the
> internal sensors.

Which is exactly why this should be the opposite way round, with thermal
zones referring to sensors in this fashion.

The way you're suggesting means that the phandle+cells specifier refers
half to the consumer and half to the producer. Having thermal zones
refer to sensors would make this refer solely to the producer (the
sensor).

>
>
> >> +
> >> +***************
> >> +cpufreq-cooling
> >> +***************
> >> +
> >> +The generic cpu cooling (freq clipping) can be loaded by the
> >> +generic cpufreq-cpu0 driver in case the device tree node informs
> >> +this need.
> >> +
> >> +In order to load the cpufreq cooling device, it is possible to
> >> +inform that the cpu needs cooling by adding the list of thermal zones
> >> +in the 'cooling-zones' property at the cpu0 phandle.
> >
> > This should be reworded so as to describe the binding rather than the
> > Linux behaviour based on the binding.
>
> OK.
>
> >
> >> +
> >> +Example:
> >> + cpus {
> >> + cpu@0 {
> >> + operating-points = <
> >> + /* kHz uV */
> >> + 800000 1313000
> >> + 1008000 1375000
> >> + >;
> >> + cooling-zones = <&cpu_thermal>;
> >> + };
> >> + ...
> >> + };
> >> +
> >> +The above will cause the cpufreq-cpu0 driver to understand that
> >> +the cpu will need passive cooling and while probing that node it will
> >> +also load the cpufreq cpu cooling device in that system.
> >
> > The above describe that the CPU is part of a cooling (thermal?) zone and
> > requires cooling.
>
> That says the cpu participates in a thermal zone as a cooling
> device/mechanism, thus cpu cooling (passive cooling) is required in
> target system.

I think you need the rest of the nodes in this example. You don't show
whether or not there's a cooling-device, so it's not clear that the CPU
is required to be passively cooled...

>
> >
> >> +
> >> +However, be advised that this flag will not describe what needs to
> >> +be done or how the cpufreq cooling device will be used. In order to
> >> +accomplish this, one needs to write a phandle for a thermal zone, as
> >> +described in the section 'thermal_zone'.
> >
> > The above already has a phandle to a thermal (cooling?) zone.
> >
> > Please choose either cooling zone or thermal zone, having the two names
> > makes this more confusing than necessary.
>
> I see. Idea is to have thermal zone as a node. cooling-zone is just a
> list of thermal zones that the referred device will be used as cooling
> device.

While I can see a distinction, the nomenclature is horrendously
confusing.

>
> >
> >> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> >> index 7fb16bc..753f0dc 100644
> >> --- a/drivers/thermal/Kconfig
> >> +++ b/drivers/thermal/Kconfig
> >> @@ -29,6 +29,19 @@ config THERMAL_HWMON
> >> Say 'Y' here if you want all thermal sensors to
> >> have hwmon sysfs interface too.
> >>
> >> +config THERMAL_OF
> >> + bool
> >> + prompt "APIs to parse thermal data out of device tree"
> >> + depends on OF
> >> + default y
> >> + help
> >> + This options provides helpers to add the support to
> >> + read and parse thermal data definitions out of the
> >> + device tree blob.
> >> +
> >> + Say 'Y' here if you need to build thermal infrastructure
> >> + based on device tree.
> >> +
> >> choice
> >> prompt "Default Thermal governor"
> >> default THERMAL_DEFAULT_GOV_STEP_WISE
> >> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> >> index 24cb894..eedb273 100644
> >> --- a/drivers/thermal/Makefile
> >> +++ b/drivers/thermal/Makefile
> >> @@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o
> >>
> >> # interface to/from other layers providing sensors
> >> thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o
> >> +thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o
> >>
> >> # governors
> >> thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o
> >> diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c
> >> new file mode 100644
> >> index 0000000..cc35485
> >> --- /dev/null
> >> +++ b/drivers/thermal/thermal_dt.c
> >> @@ -0,0 +1,473 @@
> >> +/*
> >> + * thermal_dt.c - Generic Thermal Management device tree support.
> >> + *
> >> + * Copyright (C) 2013 Texas Instruments
> >> + * Copyright (C) 2013 Eduardo Valentin <[email protected]>
> >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License as published by
> >> + * the Free Software Foundation; version 2 of the License.
> >> + *
> >> + * This program is distributed in the hope that it will be useful, but
> >> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> >> + * General Public License for more details.
> >> + *
> >> + * You should have received a copy of the GNU General Public License along
> >> + * with this program; if not, write to the Free Software Foundation, Inc.,
> >> + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
> >> + *
> >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >> + */
> >> +#include <linux/thermal.h>
> >> +#include <linux/slab.h>
> >> +#include <linux/types.h>
> >> +#include <linux/of_device.h>
> >> +#include <linux/of_platform.h>
> >> +#include <linux/err.h>
> >> +#include <linux/export.h>
> >> +#include <linux/string.h>
> >> +
> >> +struct __thermal_bind_params {
> >> + char cooling_device[THERMAL_NAME_LENGTH];
> >> +};
> >> +
> >> +static
> >> +int thermal_of_match(struct thermal_zone_device *tz,
> >> + struct thermal_cooling_device *cdev);
> >> +
> >> +static int thermal_of_populate_bind_params(struct device *dev,
> >> + struct device_node *node,
> >> + struct __thermal_bind_params *__tbp,
> >> + struct thermal_bind_params *tbp,
> >> + int ntrips)
> >> +{
> >> + const char *cdev_name;
> >> + u32 *limits;
> >> + int ret;
> >> + u32 prop;
> >> +
> >> + ret = of_property_read_u32(node, "weight", &prop);
> >> + if (ret < 0) {
> >> + dev_err(dev, "missing weight property\n");
> >> + return ret;
> >> + }
> >> + tbp->weight = prop;
> >> +
> >> + ret = of_property_read_u32(node, "mask", &prop);
> >> + if (ret < 0) {
> >> + dev_err(dev, "missing mask property\n");
> >> + return ret;
> >> + }
> >> + tbp->trip_mask = prop;
> >> +
> >> + /* this will allow us to bind with cooling devices */
> >> + tbp->match = thermal_of_match;
> >> +
> >> + ret = of_property_read_string(node, "cooling_device", &cdev_name);
> >> + if (ret < 0) {
> >> + dev_err(dev, "missing cooling_device property\n");
> >> + return ret;
> >> + }
> >> + strncpy(__tbp->cooling_device, cdev_name,
> >> + sizeof(__tbp->cooling_device));
> >> +
> >> + limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL);
> >
> > Why do you use kzalloc here, but devm_kzalloc elsewhere? That'll lead to
> > problems as the driver gets refactored over time.
>
> Because this memory is used only within this function and thus released
> at the bottom of it.
>
> >
> >> + if (!limits) {
> >> + dev_err(dev, "not enough mem for reading limits\n");
> >> + return -ENOMEM;
> >> + }
> >> + ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips);
> >> + if (!ret) {
> >> + int i = ntrips;
> >> +
> >> + tbp->binding_limits = devm_kzalloc(dev,
> >> + ntrips * 2 *
> >> + sizeof(*tbp->binding_limits),
> >> + GFP_KERNEL);
> >> + if (!tbp->binding_limits) {
> >> + dev_err(dev, "not enough mem for storing limits\n");
> >> + return -ENOMEM;
> >> + }
> >> + while (i--)
> >> + tbp->binding_limits[i] = limits[i];
> >> + }
> >> + kfree(limits);
>
> here.

I saw that, I did say that this may lead to problems in future if the
function's reorganised, not now. :)

>
> >> +
> >> + return 0;
> >
> > What if you failed to read the array? Surely you should return an error?
>
> Attempting to make this property not mandatory.

That wasn't clear from the binding, and now I've re-read it, I'm still
not entirely clear on what the limits represent, and why you may have
multiple sets of pairs.

>
> >
> >> +}
> >> +
> >> +struct __thermal_trip {
> >> + unsigned long int temperature;
> >> + unsigned long int hysteresis;
> >> + enum thermal_trip_type type;
> >> +};
> >> +
> >> +static
> >> +int thermal_of_populate_trip(struct device *dev,
> >> + struct device_node *node,
> >> + struct __thermal_trip *trip)
> >> +{
> >> + int ret;
> >> + int prop;
> >> +
> >> + ret = of_property_read_u32(node, "temperature", &prop);
> >> + if (ret < 0) {
> >> + dev_err(dev, "missing temperature property\n");
> >> + return ret;
> >> + }
> >> + trip->temperature = prop;
> >> +
> >> + ret = of_property_read_u32(node, "hysteresis", &prop);
> >> + if (ret < 0) {
> >> + dev_err(dev, "missing hysteresis property\n");
> >> + return ret;
> >> + }
> >> + trip->hysteresis = prop;
> >> +
> >> + ret = of_property_read_u32(node, "type", &prop);
> >> + if (ret < 0) {
> >> + dev_err(dev, "missing type property\n");
> >> + return ret;
> >> + }
> >> + trip->type = prop;
> >
> > This will require sanity checking.
>
> OK.
>
> >
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +struct __thermal_zone_device {
> >> + enum thermal_device_mode mode;
> >> + int passive_delay;
> >> + int polling_delay;
> >> + int mask;
> >> + int ntrips;
> >> + char type[THERMAL_NAME_LENGTH];
> >> + struct __thermal_trip *trips;
> >> + struct __thermal_bind_params *bind_params;
> >> + struct thermal_bind_params *tbps;
> >> + struct thermal_zone_params zone_params;
> >> + int (*get_temp)(void *, unsigned long *);
> >> + void *devdata;
> >> +};
> >> +
> >> +static struct __thermal_zone_device *
> >> +thermal_of_build_thermal_zone(struct device *dev, struct device_node *node)
> >> +{
> >> + struct device_node *child, *gchild;
> >> + struct __thermal_zone_device *tz;
> >> + int ret, i;
> >> + u32 prop;
> >> +
> >> + if (!node) {
> >> + dev_err(dev, "no thermal zone node\n");
> >> + return ERR_PTR(-EINVAL);
> >> + }
> >> +
> >> + tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL);
> >> + if (!tz) {
> >> + dev_err(dev, "not enough memory for thermal of zone\n");
> >> + return ERR_PTR(-ENOMEM);
> >> + }
> >> +
> >> + ret = of_property_read_u32(node, "passive_delay", &prop);
> >> + if (ret < 0) {
> >> + dev_err(dev, "missing passive_delay property\n");
> >> + return ERR_PTR(ret);
> >> + }
> >> + tz->passive_delay = prop;
> >> +
> >> + ret = of_property_read_u32(node, "polling_delay", &prop);
> >> + if (ret < 0) {
> >> + dev_err(dev, "missing polling_delay property\n");
> >> + return ERR_PTR(ret);
> >> + }
> >> + tz->polling_delay = prop;
> >> +
> >> + ret = of_property_read_u32(node, "mask", &prop);
> >> + if (ret < 0) {
> >> + dev_err(dev, "missing mask property\n");
> >> + return ERR_PTR(ret);
> >> + }
> >> + tz->mask = prop;
> >> +
> >> + /* assume node name as our zone type */
> >> + strncpy(tz->type, node->name, sizeof(tz->type));
> >> +
> >> + /* default policy */
> >> + strncpy(tz->zone_params.governor_name, "step_wise",
> >> + sizeof(tz->zone_params.governor_name));
> >> +
> >> + /* trips */
> >> + child = of_find_node_by_name(node, "trips");
> >
> > That might not be a child node, as of_find_node_by_name walks over the
> > list of all nodes. Use of_get_child_by_name.
>
> Right! This is a bug.
>
> >
> >> +
> >> + /* No trips provided */
> >> + if (!child)
> >> + goto finish;
> >> +
> >> + tz->ntrips = of_get_child_count(child);
> >> + tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips),
> >> + GFP_KERNEL);
> >> + if (!tz->trips)
> >> + return ERR_PTR(-ENOMEM);
> >> + i = 0;
> >> + for_each_child_of_node(child, gchild)
> >> + thermal_of_populate_trip(dev, gchild, &tz->trips[i++]);
> >
> > What if a child node was not a valid trip node? It seems
> > thermal_of_populate_trip returns errors you're ignoring.
>
> OK. Adding proper treatment.
>
> >
> >> +
> >> + /* bind_params */
> >> + child = of_find_node_by_name(node, "bind_params");
> >
> > Another of_get_child_by_name candidate.
>
> OK.
>
> >
> >> +
> >> + /* No binding info */
> >> + if (!child)
> >> + goto finish;
> >> +
> >> + tz->zone_params.num_tbps = of_get_child_count(child);
> >> + tz->bind_params = devm_kzalloc(dev,
> >> + tz->zone_params.num_tbps *
> >> + sizeof(*tz->bind_params),
> >> + GFP_KERNEL);
> >> + if (!tz->bind_params)
> >> + return ERR_PTR(-ENOMEM);
> >> + tz->zone_params.tbp = devm_kzalloc(dev,
> >> + tz->zone_params.num_tbps *
> >> + sizeof(*tz->zone_params.tbp),
> >> + GFP_KERNEL);
> >> + if (!tz->zone_params.tbp)
> >> + return ERR_PTR(-ENOMEM);
> >> + i = 0;
> >> + for_each_child_of_node(child, gchild) {
> >> + thermal_of_populate_bind_params(dev, gchild,
> >> + &tz->bind_params[i],
> >> + &tz->zone_params.tbp[i],
> >> + tz->ntrips);
> >
> > Similarly, you're ignoring error conditions here. What if a bind_params
> > node was malformed?
>
>
> Adding proper error handling.
>
> >
> >> + i++;
> >> + }
> >> +
> >> +finish:
> >> + tz->mode = THERMAL_DEVICE_ENABLED;
> >> +
> >> + return tz;
> >> +}
> >> +
> >> +static
> >> +int thermal_of_match(struct thermal_zone_device *tz,
> >> + struct thermal_cooling_device *cdev)
> >> +{
> >> + struct __thermal_zone_device *data = tz->devdata;
> >> + int i;
> >> +
> >> + for (i = 0; i < data->zone_params.num_tbps; i++) {
> >> + if (!strncmp(data->bind_params[i].cooling_device,
> >> + cdev->type,
> >> + strlen(data->bind_params[i].cooling_device)) &&
> >> + (data->zone_params.tbp[i].trip_mask & (1 << i)))
> >> + return 0;
> >> + }
> >> +
> >> + return -EINVAL;
> >> +}
> >
> > I'm not keen on binding to cooling devices using a string, as suddenly a
> > name to allow humans to inspect the system is turned into an ABI.
>
> OK. We need to align on how we describe the binding then.

Definitely. We need to ensure that we fully define the relationship too
cooling devices, and don't suddenly expose some piece of internal Linux
infrastructure to the world.

>
> >
> >> +
> >> +static
> >> +int of_thermal_get_temp(struct thermal_zone_device *tz,
> >> + unsigned long *temp)
> >> +{
> >> + struct __thermal_zone_device *data = tz->devdata;
> >> +
> >> + return data->get_temp(data->devdata, temp);
> >> +}
> >> +
> >> +static
> >> +int of_thermal_get_mode(struct thermal_zone_device *tz,
> >> + enum thermal_device_mode *mode)
> >> +{
> >> + struct __thermal_zone_device *data = tz->devdata;
> >> +
> >> + *mode = data->mode;
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static
> >> +int of_thermal_set_mode(struct thermal_zone_device *tz,
> >> + enum thermal_device_mode mode)
> >> +{
> >> + struct __thermal_zone_device *data = tz->devdata;
> >> +
> >> + mutex_lock(&tz->lock);
> >> +
> >> + if (mode == THERMAL_DEVICE_ENABLED)
> >> + tz->polling_delay = data->polling_delay;
> >> + else
> >> + tz->polling_delay = 0;
> >> +
> >> + mutex_unlock(&tz->lock);
> >> +
> >> + data->mode = mode;
> >> + thermal_zone_device_update(tz);
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static
> >> +int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip,
> >> + enum thermal_trip_type *type)
> >> +{
> >> + struct __thermal_zone_device *data = tz->devdata;
> >> +
> >> + if (trip >= data->ntrips || trip < 0)
> >> + return -EDOM;
> >
> > It's far more common to use -EINVAL.
>
> hmm.. OK.
>
> >
> >> +
> >> + *type = data->trips[trip].type;
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static
> >> +int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip,
> >> + unsigned long *temp)
> >> +{
> >> + struct __thermal_zone_device *data = tz->devdata;
> >> +
> >> + if (trip >= data->ntrips || trip < 0)
> >> + return -EDOM;
> >> +
> >> + *temp = data->trips[trip].temperature;
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static
> >> +int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip,
> >> + unsigned long temp)
> >> +{
> >> + struct __thermal_zone_device *data = tz->devdata;
> >> +
> >> + if (trip >= data->ntrips || trip < 0)
> >> + return -EDOM;
> >> +
> >> + /* thermal fw should take care of data->mask & (1 << trip) */
> >> + data->trips[trip].temperature = temp;
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static
> >> +int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip,
> >> + unsigned long *hyst)
> >> +{
> >> + struct __thermal_zone_device *data = tz->devdata;
> >> +
> >> + if (trip >= data->ntrips || trip < 0)
> >> + return -EDOM;
> >> +
> >> + *hyst = data->trips[trip].hysteresis;
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static
> >> +int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip,
> >> + unsigned long hyst)
> >> +{
> >> + struct __thermal_zone_device *data = tz->devdata;
> >> +
> >> + if (trip >= data->ntrips || trip < 0)
> >> + return -EDOM;
> >> +
> >> + /* thermal fw should take care of data->mask & (1 << trip) */
> >
> > Could you elaborate on that comment?
>
> In case the trip is not assigned to be writable, this function won't be
> called.

Ah. I misunderstood "fw" as "firmware", rather than "framework", and got
confused. It would be nice if you could expand "fw" to make this
clearer. :)

Thanks,
Mark.

[1] http://en.wikipedia.org/wiki/Black-body_radiation

2013-08-27 18:18:45

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

On 27-08-2013 12:23, Mark Rutland wrote:
> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote:
>> Hello Mark,
>>
>>
>> First of all, thanks for taking the time to review in such level of
>> detail. Lets try to align and have a common understanding. Answers go
>> inline. Please let me know if I missed something or if I made it more
>> confusing :-)
>>
>> On 27-08-2013 06:22, Mark Rutland wrote:
>>> Hi,
>>>
>>> On Sat, Aug 24, 2013 at 12:15:43AM +0100, Eduardo Valentin wrote:
>>>> In order to be able to build thermal policies
>>>> based on generic sensors, like I2C device, that
>>>> can be places in different points on different boards,
>>>> there is a need to have a way to feed board dependent
>>>> data into the thermal framework.
>>>>
>>>> This patch introduces a thermal data parser for device
>>>> tree. The parsed data is used to build thermal zones
>>>> and thermal binding parameters. The output data
>>>> can then be used to deploy thermal policies.
>>>>
>>>> This patch adds also documentation regarding this
>>>> API and how to define tree nodes to use
>>>> this infrastructure.
>>>>
>>>> Cc: Zhang Rui <[email protected]>
>>>> Cc: [email protected]
>>>> Cc: [email protected]
>>>> Signed-off-by: Eduardo Valentin <[email protected]>
>>>> ---
>>>> .../devicetree/bindings/thermal/thermal.txt | 160 +++++++
>>>> drivers/thermal/Kconfig | 13 +
>>>> drivers/thermal/Makefile | 1 +
>>>> drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++
>>>> include/dt-bindings/thermal/thermal.h | 27 ++
>>>> include/linux/thermal.h | 3 +
>>>> 6 files changed, 677 insertions(+)
>>>> create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt
>>>> create mode 100644 drivers/thermal/thermal_dt.c
>>>> create mode 100644 include/dt-bindings/thermal/thermal.h
>>>>
>>
>>
>> General question on device tree documentation. Assuming there is such an
>> effort to make it Linux independent, is there other places out of Linux
>> tree that these binding must be documented?
>
> At present, no. There is a plan to move the documentation out of the
> kernel, but I'm not sure on how that's currently progressing. Some
> general cleanup needs to occur before we can actually do that.
>

OK.

>>
>>
>>>> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
>>>> new file mode 100644
>>>> index 0000000..af20ab0
>>>> --- /dev/null
>>>> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt
>>>> @@ -0,0 +1,160 @@
>>>> +----------------------------------------
>>>> +Thermal Framework Device Tree descriptor
>>>> +----------------------------------------
>>>> +
>>>> +This file describes how to define a thermal structure using device tree.
>>>> +A thermal structure includes thermal zones and their components, such
>>>> +as name, trip points, polling intervals and cooling devices binding
>>>> +descriptors. A binding descriptor may contain information on how to
>>>> +react, with a cooling stepped action or a weight on a fair share.
>>>> +The target of device tree thermal descriptors is to describe only
>>>> +the hardware thermal aspects, not how the system must control or which
>>>> +algorithm or policy must take place. It follows a description of each
>>>> +type of device tree node.
>>>> +
>>>> +****
>>>> +trip
>>>> +****
>>>> +
>>>> +The trip node is a node to describe a point in the temperature domain
>>>> +in which the system takes an action. This node describes just the point,
>>>> +not the action.
>>>> +
>>>> +A node describing a trip must contain:
>>>> +- temperature: the trip temperature level, in milliCelsius.
>>>> +- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative
>>>> +value, in milliCelsius.
>>>
>>> Presumably this is a single (u32) cell?
>>
>> Yes, both of them above are a single u32.
>
> Ok. This seems to be the default assumption in many bindings, so I
> shan't argue against it here, but in general we should be as precise as
> possible as to the format of the devicetree.
>

Yeah, I followed the default assumption. It does not hurt to be
descriptive though.

>>
>>>
>>>> +- type: the trip type. Here is the type mapping:
>>>> + THERMAL_TRIP_ACTIVE
>>>> + THERMAL_TRIP_PASSIVE
>>>> + THERMAL_TRIP_HOT
>>>> + THERMAL_TRIP_CRITICAL
>>>
>>> What type is this property?
>>>
>>> What are these values? Do you need to refer to a header somewhere?
>>
>> There is a header, introduced in this patch.
>> include/dt-bindings/thermal/thermal.h
>>
>>> Symbolic constants should have an associated value for an ABI.
>>
>> OK. Sounds reasonable to me. Shall we be descriptive and enlist the
>> values in this doc too or just pointing to header is fine?
>
> I believe that referring to the header is fine, though personally I'd
> prefer if we were explicit in binding docs. I'll leave Stephen to
> comment on what the preferred way of handling header constants with
> binding documents is.
>

OK. I will do both.

>>
>>>
>>>> +
>>>> +**********
>>>> +bind_param
>>>> +**********
>>>> +
>>>> +The bind_param node is a node to describe how cooling devices get assigned
>>>> +to trip points of the zone. The cooling devices are expected to be loaded
>>>> +in the target system.
>>>> +
>>>> +A node describing a bind_param must contain:
>>>> +- cooling_device: A string with the cooling device name.
>>>
>>> Please use '-' rather than '_' in bindings. That's the normal devicetree
>>> convention, and there's no point gonig against it and causing more
>>> confusion.
>>
>> Agreed.
>>
>>>
>>> What are valid values of this string, and what do they mean?
>>
>> These are the name of the cooling devices to match to.
>
> Which are what exactly? What constitutes a "cooling device"?
>
> Is "black-body radiation" [1] a valid cooling device?
>
> How about "copper heat sink"?
>

Those are example of cooling devices, yes. But in this context what is
interesting is those devices that allow you to control the amount of
dissipated power, i.e. operating frequency/voltage, lcd brightness,
audio volume, radio transmission power, or if those that you burn power
by turning them on but you remove heat out of your thermal zone, like a
speed controllable fan.

>>
>>>
>>> Why not a phandle + cells binding to cooling devices?
>>>
>>
>> Yeah, this could work too. I am going to check how that would look like
>> and come back to you.
>
> Ok.
>
>>
>>>> +- weight: The 'influence' of a particular cooling device on this zone.
>>>> + This is on a percentage scale. The sum of all these weights
>>>> + (for a particular zone) cannot exceed 100.
>>>
>>> This is a bit fuzzy, and certainly needs more guidance.
>>
>> OK. This property is used to describe the amount of cooling contribution
>> when activating a cooling device at a certain trip point. I will work on
>> a better guidance.
>>
>>>
>>>> +- trip_mask: This is a bit mask that gives the binding relation between
>>>> + this thermal zone and cdev, for a particular trip point.
>>>> + If nth bit is set, then the cdev and thermal zone are bound
>>>> + for trip point n.
>>>
>>> What type is this?
>>>
>>> The nth bit from which end?
>>>
>>> Could you explain what "bound for trip point n" means?
>>>
>>
>> It is just a way to match a cooling device with a set of trip points.
>> Say you want to match one specific cooling device with trips 0 and 1,
>> you would then create a bind_param with trip_mask = 0x3.
>>
>>> Just because this is a bitmask in Linux does not mean it needs to be so
>>> in the dt.
>>
>> Indeed.
>>
>> In case we work with phandle + cell binging parameters, the trip and
>> weight could be passed as params.
>>
>> cooling-devices = <&devX tripY weightZ>, <... >;
>
> Something like that could certainly work.

I see. I will check that option.

>
>>
>>>
>>>> +- limits: An array integers consisting of 2-tuples items, and each item means
>>>> + lower and upper state limits, like <min-state max-state>.
>>>> +
>>>> +**************************
>>>> +temperature sensor devices
>>>> +**************************
>>>> +
>>>> +Temperature sensor devices have to be linked to a specific thermal zone.
>>>> +This is done by means of the 'monitored-zones' list.
>>>> +- monitored-zones: A list of thermal zones phandles, identifying which
>>>> +thermal zones the temperature sensor device is used to monitor.
>>>
>>> Why does this need to be described that way around?
>>>
>>> Can the zone not have a link to the sensor(s) it needs?
>>>
>>
>> Do you see any issues with it?
>>
>> Idea is that sensor nodes would list which zones they are used. Same
>> idea goes to cooling devices. It can be the other way around, yes. It
>> depends on what is the best way to describe these relations. Idea would
>> be so that sensors and cooling devices would be composing a thermal zone.
>
> I think that having a zone refer to the sensors that monitor it would be
> preferable. Otherwise we're encoding that way a devices output is to be
> consumed on the binding for the device itself, which feels odd (it's
> arguable as to whether it's a property of the device, however).
>

I got confused now. How adding a monitored-zones property on sensor's
nodes would imply that device's output is to be consumed on the binding
for the device itself?

>>
>>>> +
>>>> +************
>>>> +thermal_zone
>>>> +************
>>>> +
>>>> +The thermal_zone node is the node containing all the required info
>>>> +for describing a thermal zone, including its cdev bindings. The thermal_zone
>>>> +node must contain, apart from its own properties, one node containing
>>>> +trip nodes and one node containing all the zone bind parameters.
>>>> +
>>>> +Required properties:
>>>> +- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable.
>>>
>>> What type is a bit string in devicetree? I'm aware of cells and (byte)
>>> strings, but not bit strings.
>>>
>>> What does 'writeable' mean in this context?
>>>
>>> There's presumably a better name than 'mask'.
>>>
>>
>> This is used to describe if trip points are writeable under sysfs. I
>> think this is a OS implementation leak. I will be removing this from
>> device tree and making them RO by default.
>
> Ok.
>
>>
>>>> +
>>>> +- passive_delay: number of milliseconds to wait between polls when
>>>> + performing passive cooling.
>>>> +
>>>> +- polling_delay: number of milliseconds to wait between polls when checking.
>>>
>>> These sound very much like configuration of the driver rather than
>>> hardware description. What if my temperature sensor can generate an
>>> interrupt at some trigger temperature? Why would I need polling times?
>>>
>>> Why does this need to be in the dt at all? Surely the OS can choose some
>>> sensible polling period, possibly dynamically?
>>
>> Well, let me answer this one with another question. How the OS will
>> determine how fast is the temperature rise on a thermal zone described
>> in device tree?
>
> How will the devicetree writer determine this?
>
> It's possible for the thermal management code to dynamically adjust the
> rate between some extreme polling intervals based on how big a rise it
> detects between polls or possibly something more complex taking into
> account some metric for load.

It is possible yes, by means of a couple of complex differential
equations implementation. But still, determining the maximum dT/dt may
require unnecessary polling on low activity use cases. Besides, people
who do transient thermal simulation should have a better estimate on
necessary polling requirement. I don't see why we should not use that
handy info and impose hard software requirements.

>
>>
>>>
>>>> +
>>>> +- #thermal-cells: An integer that is used to specify how many cells compose
>>>> + sensor ids.
>>>
>>> I don't see why this is necessary. If the zone itself described its
>>> related sensors rather than the other way around, we wouldn't need it at
>>> all. Most bindings so far have consumers link to their providers rather
>>> than the other way around.
>>>
>>
>> I see.
>>
>>
>>>> +
>>>> +Below is an example:
>>>> +cpu_thermal: thermal_zone {
>>>> + #thermal-cells = <1>;
>>>> + mask = <0x03>; /* trips writability */
>>>> + passive_delay = <250>; /* milliseconds */
>>>> + polling_delay = <1000>; /* milliseconds */
>>>> + trips {
>>>> + alert@100000{
>>>> + temperature = <100000>; /* milliCelsius */
>>>> + hysteresis = <0>; /* milliCelsius */
>>>> + type = <THERMAL_TRIP_PASSIVE>;
>>>> + };
>>>> + crit@125000{
>>>> + temperature = <125000>; /* milliCelsius */
>>>> + hysteresis = <0>; /* milliCelsius */
>>>> + type = <THERMAL_TRIP_CRITICAL>;
>>>> + };
>
> I didn't spot this on my first round, but the trips node will need
> #address-cells and #size-cells, and the sub nodes unit-addresses should
> match their reg properties.

You mean, thermal zones must have address and size cells to describe trips?

>
>>>> + };
>>>> + bind_params {
>>>> + action@0{
>>>> + cooling_device = "thermal-cpufreq";
>>>
>>> NAK. That value was not defined anywhere, and represents Linux
>>> infrastructure rather than hardware.
>>>
>>
>> Yes, this was another leak as discussed above in the upper part of the
>> doc file.
>>
>>> For this case, we could just as easily describe the thermal points, and
>>> if no physical cooling device (e.g. a fan) is present, assume
>>> cpufreq-base cooling. Other OSs could choose to do otherwise, and we
>>> could change later...
>>>
>>
>> The way other specifications do (aka ACPI) is to have trip types. In
>> case a trip type is passive cooling, it means one is supposed to
>> modulate the dissipated power rather than activating a device to remove
>> heat (active cooling).
>>
>> The part of 'assuming' things that bothers me. I would rather have a way
>> to really say one wants to have passive cooling at a specific trip point.
>
> I don't see the problem here. If you don't specify a device, your only
> options will be some sort of passive management.

But it does not need to be active all the time. You need to say when (in
temperature domain) that is need. Besides, there are even other means of
managing average power other than cpufreq cooling. idle injections is
another passive cooling device, for instance.

>
>>
>>
>>>> + weight = <100>; /* percentage */
>>>> + mask = <0x01>;
>>>
>>> This should presumably be trip-mask (assuming my suggested corrections).
>>>
>>> What is it supposed to mean here?
>>
>> As stated above, it means the trip points that this binding represents.
>
> If you have more than 32 trip points (which is admittedly unlikely), how
> do you represent them? I'm not sure a mask is the best representation
> here, as it's difficult for a human to match up with a list of entries.
>
> Additionally, the trip points don't have a logical contiguous numbering
> from zero, so how do the bits in the mask correspond to entries in the
> list? This seems to rely on the order that the subnodes are in, which
> AFAIK is not guaranteed. It would be far nicer to have an explicit
> linkage.


Do you mean some sort of list of trip points property inside binding
params node?

>
>>
>>>
>>>> + limits = <
>>>> + /* lower-state upper-state */
>>>> + 1 THERMAL_NO_LIMITS
>>>> + THERMAL_NO_LIMITS THERMAL_NO_LIMITS
>>>> + >;
>>>> + };
>>>> + };
>>>> +};
>>>> +
>>>> +sensor: adc@0xFFFF {
>>>> + ...
>>>> + monitored-zones = <&cpu_thermal 0>;
>>>> +};
>>>> +
>>>> +cpu@0 {
>>>> + ...
>>>> + cooling-zones = <&cpu_thermal>;
>>>> +};
>>>> +
>>>> +In the example above, the ADC sensor at address 0xFFFF is used to monitor
>>>> +the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked
>>>> +to the same thermal zone 'cpu_thermal' as a cooling device.
>>>
>>> Huh? That seems to imply the the '0' in '<&cpu_thermal 0>' on the sensor
>>> node describes a property of the sensor rather than the thermal node,
>>> unless I've misunderstood?
>>>
>>
>> It is essentially which sensor id is used to monitor a specific zone. In
>> case of sensor IPs that feature several internal sensors, one needs to
>> have a way to describe which sensor monitors which zone. Example of this
>> type of IPs: lm90 (up to 3 sensors) and TI bandgap IP (can contain up to
>> 5 internal sensors). The probe happens for the IP device and not for the
>> internal sensors.
>
> Which is exactly why this should be the opposite way round, with thermal
> zones referring to sensors in this fashion.
>
> The way you're suggesting means that the phandle+cells specifier refers
> half to the consumer and half to the producer. Having thermal zones
> refer to sensors would make this refer solely to the producer (the
> sensor).
>

Are you suggesting having inside thermal zone node:

sensors = <&sensor0 Y>, ...., <&sensorN Z> ;

Where sensor0 and sensorN are sensor IPs nodes and Y and Z represent
their internal sensor instances?

>>
>>
>>>> +
>>>> +***************
>>>> +cpufreq-cooling
>>>> +***************
>>>> +
>>>> +The generic cpu cooling (freq clipping) can be loaded by the
>>>> +generic cpufreq-cpu0 driver in case the device tree node informs
>>>> +this need.
>>>> +
>>>> +In order to load the cpufreq cooling device, it is possible to
>>>> +inform that the cpu needs cooling by adding the list of thermal zones
>>>> +in the 'cooling-zones' property at the cpu0 phandle.
>>>
>>> This should be reworded so as to describe the binding rather than the
>>> Linux behaviour based on the binding.
>>
>> OK.
>>
>>>
>>>> +
>>>> +Example:
>>>> + cpus {
>>>> + cpu@0 {
>>>> + operating-points = <
>>>> + /* kHz uV */
>>>> + 800000 1313000
>>>> + 1008000 1375000
>>>> + >;
>>>> + cooling-zones = <&cpu_thermal>;
>>>> + };
>>>> + ...
>>>> + };
>>>> +
>>>> +The above will cause the cpufreq-cpu0 driver to understand that
>>>> +the cpu will need passive cooling and while probing that node it will
>>>> +also load the cpufreq cpu cooling device in that system.
>>>
>>> The above describe that the CPU is part of a cooling (thermal?) zone and
>>> requires cooling.
>>
>> That says the cpu participates in a thermal zone as a cooling
>> device/mechanism, thus cpu cooling (passive cooling) is required in
>> target system.
>
> I think you need the rest of the nodes in this example. You don't show
> whether or not there's a cooling-device, so it's not clear that the CPU
> is required to be passively cooled...

The property states the cpu is used as cooling device for those thermal
zones in the list 'cooling-zones'

>
>>
>>>
>>>> +
>>>> +However, be advised that this flag will not describe what needs to
>>>> +be done or how the cpufreq cooling device will be used. In order to
>>>> +accomplish this, one needs to write a phandle for a thermal zone, as
>>>> +described in the section 'thermal_zone'.
>>>
>>> The above already has a phandle to a thermal (cooling?) zone.
>>>
>>> Please choose either cooling zone or thermal zone, having the two names
>>> makes this more confusing than necessary.
>>
>> I see. Idea is to have thermal zone as a node. cooling-zone is just a
>> list of thermal zones that the referred device will be used as cooling
>> device.
>
> While I can see a distinction, the nomenclature is horrendously
> confusing.


OK. I will try harder to improve it.

>
>>
>>>
>>>> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
>>>> index 7fb16bc..753f0dc 100644
>>>> --- a/drivers/thermal/Kconfig
>>>> +++ b/drivers/thermal/Kconfig
>>>> @@ -29,6 +29,19 @@ config THERMAL_HWMON
>>>> Say 'Y' here if you want all thermal sensors to
>>>> have hwmon sysfs interface too.
>>>>
>>>> +config THERMAL_OF
>>>> + bool
>>>> + prompt "APIs to parse thermal data out of device tree"
>>>> + depends on OF
>>>> + default y
>>>> + help
>>>> + This options provides helpers to add the support to
>>>> + read and parse thermal data definitions out of the
>>>> + device tree blob.
>>>> +
>>>> + Say 'Y' here if you need to build thermal infrastructure
>>>> + based on device tree.
>>>> +
>>>> choice
>>>> prompt "Default Thermal governor"
>>>> default THERMAL_DEFAULT_GOV_STEP_WISE
>>>> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
>>>> index 24cb894..eedb273 100644
>>>> --- a/drivers/thermal/Makefile
>>>> +++ b/drivers/thermal/Makefile
>>>> @@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o
>>>>
>>>> # interface to/from other layers providing sensors
>>>> thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o
>>>> +thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o
>>>>
>>>> # governors
>>>> thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o
>>>> diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c
>>>> new file mode 100644
>>>> index 0000000..cc35485
>>>> --- /dev/null
>>>> +++ b/drivers/thermal/thermal_dt.c
>>>> @@ -0,0 +1,473 @@
>>>> +/*
>>>> + * thermal_dt.c - Generic Thermal Management device tree support.
>>>> + *
>>>> + * Copyright (C) 2013 Texas Instruments
>>>> + * Copyright (C) 2013 Eduardo Valentin <[email protected]>
>>>> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> + * it under the terms of the GNU General Public License as published by
>>>> + * the Free Software Foundation; version 2 of the License.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful, but
>>>> + * WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>>> + * General Public License for more details.
>>>> + *
>>>> + * You should have received a copy of the GNU General Public License along
>>>> + * with this program; if not, write to the Free Software Foundation, Inc.,
>>>> + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
>>>> + *
>>>> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> + */
>>>> +#include <linux/thermal.h>
>>>> +#include <linux/slab.h>
>>>> +#include <linux/types.h>
>>>> +#include <linux/of_device.h>
>>>> +#include <linux/of_platform.h>
>>>> +#include <linux/err.h>
>>>> +#include <linux/export.h>
>>>> +#include <linux/string.h>
>>>> +
>>>> +struct __thermal_bind_params {
>>>> + char cooling_device[THERMAL_NAME_LENGTH];
>>>> +};
>>>> +
>>>> +static
>>>> +int thermal_of_match(struct thermal_zone_device *tz,
>>>> + struct thermal_cooling_device *cdev);
>>>> +
>>>> +static int thermal_of_populate_bind_params(struct device *dev,
>>>> + struct device_node *node,
>>>> + struct __thermal_bind_params *__tbp,
>>>> + struct thermal_bind_params *tbp,
>>>> + int ntrips)
>>>> +{
>>>> + const char *cdev_name;
>>>> + u32 *limits;
>>>> + int ret;
>>>> + u32 prop;
>>>> +
>>>> + ret = of_property_read_u32(node, "weight", &prop);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "missing weight property\n");
>>>> + return ret;
>>>> + }
>>>> + tbp->weight = prop;
>>>> +
>>>> + ret = of_property_read_u32(node, "mask", &prop);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "missing mask property\n");
>>>> + return ret;
>>>> + }
>>>> + tbp->trip_mask = prop;
>>>> +
>>>> + /* this will allow us to bind with cooling devices */
>>>> + tbp->match = thermal_of_match;
>>>> +
>>>> + ret = of_property_read_string(node, "cooling_device", &cdev_name);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "missing cooling_device property\n");
>>>> + return ret;
>>>> + }
>>>> + strncpy(__tbp->cooling_device, cdev_name,
>>>> + sizeof(__tbp->cooling_device));
>>>> +
>>>> + limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL);
>>>
>>> Why do you use kzalloc here, but devm_kzalloc elsewhere? That'll lead to
>>> problems as the driver gets refactored over time.
>>
>> Because this memory is used only within this function and thus released
>> at the bottom of it.
>>
>>>
>>>> + if (!limits) {
>>>> + dev_err(dev, "not enough mem for reading limits\n");
>>>> + return -ENOMEM;
>>>> + }
>>>> + ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips);
>>>> + if (!ret) {
>>>> + int i = ntrips;
>>>> +
>>>> + tbp->binding_limits = devm_kzalloc(dev,
>>>> + ntrips * 2 *
>>>> + sizeof(*tbp->binding_limits),
>>>> + GFP_KERNEL);
>>>> + if (!tbp->binding_limits) {
>>>> + dev_err(dev, "not enough mem for storing limits\n");
>>>> + return -ENOMEM;
>>>> + }
>>>> + while (i--)
>>>> + tbp->binding_limits[i] = limits[i];
>>>> + }
>>>> + kfree(limits);
>>
>> here.
>
> I saw that, I did say that this may lead to problems in future if the
> function's reorganised, not now. :)

Ahh.. OK. I just didnt want o let the memory be managed and freed only
when the struct device is release.

>
>>
>>>> +
>>>> + return 0;
>>>
>>> What if you failed to read the array? Surely you should return an error?
>>
>> Attempting to make this property not mandatory.
>
> That wasn't clear from the binding, and now I've re-read it, I'm still
> not entirely clear on what the limits represent, and why you may have
> multiple sets of pairs.

Because you may assign cooling state limits to specific trip points,
e.g., a fan with 5 different speeds would operate from speed 0 to 2 in
the first trip point, while from speed 3 to 4 on last trip point.

>
>>
>>>
>>>> +}
>>>> +
>>>> +struct __thermal_trip {
>>>> + unsigned long int temperature;
>>>> + unsigned long int hysteresis;
>>>> + enum thermal_trip_type type;
>>>> +};
>>>> +
>>>> +static
>>>> +int thermal_of_populate_trip(struct device *dev,
>>>> + struct device_node *node,
>>>> + struct __thermal_trip *trip)
>>>> +{
>>>> + int ret;
>>>> + int prop;
>>>> +
>>>> + ret = of_property_read_u32(node, "temperature", &prop);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "missing temperature property\n");
>>>> + return ret;
>>>> + }
>>>> + trip->temperature = prop;
>>>> +
>>>> + ret = of_property_read_u32(node, "hysteresis", &prop);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "missing hysteresis property\n");
>>>> + return ret;
>>>> + }
>>>> + trip->hysteresis = prop;
>>>> +
>>>> + ret = of_property_read_u32(node, "type", &prop);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "missing type property\n");
>>>> + return ret;
>>>> + }
>>>> + trip->type = prop;
>>>
>>> This will require sanity checking.
>>
>> OK.
>>
>>>
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +struct __thermal_zone_device {
>>>> + enum thermal_device_mode mode;
>>>> + int passive_delay;
>>>> + int polling_delay;
>>>> + int mask;
>>>> + int ntrips;
>>>> + char type[THERMAL_NAME_LENGTH];
>>>> + struct __thermal_trip *trips;
>>>> + struct __thermal_bind_params *bind_params;
>>>> + struct thermal_bind_params *tbps;
>>>> + struct thermal_zone_params zone_params;
>>>> + int (*get_temp)(void *, unsigned long *);
>>>> + void *devdata;
>>>> +};
>>>> +
>>>> +static struct __thermal_zone_device *
>>>> +thermal_of_build_thermal_zone(struct device *dev, struct device_node *node)
>>>> +{
>>>> + struct device_node *child, *gchild;
>>>> + struct __thermal_zone_device *tz;
>>>> + int ret, i;
>>>> + u32 prop;
>>>> +
>>>> + if (!node) {
>>>> + dev_err(dev, "no thermal zone node\n");
>>>> + return ERR_PTR(-EINVAL);
>>>> + }
>>>> +
>>>> + tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL);
>>>> + if (!tz) {
>>>> + dev_err(dev, "not enough memory for thermal of zone\n");
>>>> + return ERR_PTR(-ENOMEM);
>>>> + }
>>>> +
>>>> + ret = of_property_read_u32(node, "passive_delay", &prop);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "missing passive_delay property\n");
>>>> + return ERR_PTR(ret);
>>>> + }
>>>> + tz->passive_delay = prop;
>>>> +
>>>> + ret = of_property_read_u32(node, "polling_delay", &prop);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "missing polling_delay property\n");
>>>> + return ERR_PTR(ret);
>>>> + }
>>>> + tz->polling_delay = prop;
>>>> +
>>>> + ret = of_property_read_u32(node, "mask", &prop);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "missing mask property\n");
>>>> + return ERR_PTR(ret);
>>>> + }
>>>> + tz->mask = prop;
>>>> +
>>>> + /* assume node name as our zone type */
>>>> + strncpy(tz->type, node->name, sizeof(tz->type));
>>>> +
>>>> + /* default policy */
>>>> + strncpy(tz->zone_params.governor_name, "step_wise",
>>>> + sizeof(tz->zone_params.governor_name));
>>>> +
>>>> + /* trips */
>>>> + child = of_find_node_by_name(node, "trips");
>>>
>>> That might not be a child node, as of_find_node_by_name walks over the
>>> list of all nodes. Use of_get_child_by_name.
>>
>> Right! This is a bug.
>>
>>>
>>>> +
>>>> + /* No trips provided */
>>>> + if (!child)
>>>> + goto finish;
>>>> +
>>>> + tz->ntrips = of_get_child_count(child);
>>>> + tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips),
>>>> + GFP_KERNEL);
>>>> + if (!tz->trips)
>>>> + return ERR_PTR(-ENOMEM);
>>>> + i = 0;
>>>> + for_each_child_of_node(child, gchild)
>>>> + thermal_of_populate_trip(dev, gchild, &tz->trips[i++]);
>>>
>>> What if a child node was not a valid trip node? It seems
>>> thermal_of_populate_trip returns errors you're ignoring.
>>
>> OK. Adding proper treatment.
>>
>>>
>>>> +
>>>> + /* bind_params */
>>>> + child = of_find_node_by_name(node, "bind_params");
>>>
>>> Another of_get_child_by_name candidate.
>>
>> OK.
>>
>>>
>>>> +
>>>> + /* No binding info */
>>>> + if (!child)
>>>> + goto finish;
>>>> +
>>>> + tz->zone_params.num_tbps = of_get_child_count(child);
>>>> + tz->bind_params = devm_kzalloc(dev,
>>>> + tz->zone_params.num_tbps *
>>>> + sizeof(*tz->bind_params),
>>>> + GFP_KERNEL);
>>>> + if (!tz->bind_params)
>>>> + return ERR_PTR(-ENOMEM);
>>>> + tz->zone_params.tbp = devm_kzalloc(dev,
>>>> + tz->zone_params.num_tbps *
>>>> + sizeof(*tz->zone_params.tbp),
>>>> + GFP_KERNEL);
>>>> + if (!tz->zone_params.tbp)
>>>> + return ERR_PTR(-ENOMEM);
>>>> + i = 0;
>>>> + for_each_child_of_node(child, gchild) {
>>>> + thermal_of_populate_bind_params(dev, gchild,
>>>> + &tz->bind_params[i],
>>>> + &tz->zone_params.tbp[i],
>>>> + tz->ntrips);
>>>
>>> Similarly, you're ignoring error conditions here. What if a bind_params
>>> node was malformed?
>>
>>
>> Adding proper error handling.
>>
>>>
>>>> + i++;
>>>> + }
>>>> +
>>>> +finish:
>>>> + tz->mode = THERMAL_DEVICE_ENABLED;
>>>> +
>>>> + return tz;
>>>> +}
>>>> +
>>>> +static
>>>> +int thermal_of_match(struct thermal_zone_device *tz,
>>>> + struct thermal_cooling_device *cdev)
>>>> +{
>>>> + struct __thermal_zone_device *data = tz->devdata;
>>>> + int i;
>>>> +
>>>> + for (i = 0; i < data->zone_params.num_tbps; i++) {
>>>> + if (!strncmp(data->bind_params[i].cooling_device,
>>>> + cdev->type,
>>>> + strlen(data->bind_params[i].cooling_device)) &&
>>>> + (data->zone_params.tbp[i].trip_mask & (1 << i)))
>>>> + return 0;
>>>> + }
>>>> +
>>>> + return -EINVAL;
>>>> +}
>>>
>>> I'm not keen on binding to cooling devices using a string, as suddenly a
>>> name to allow humans to inspect the system is turned into an ABI.
>>
>> OK. We need to align on how we describe the binding then.
>
> Definitely. We need to ensure that we fully define the relationship too
> cooling devices, and don't suddenly expose some piece of internal Linux
> infrastructure to the world.
>

Yeah, agreed, any suggestions?

>>
>>>
>>>> +
>>>> +static
>>>> +int of_thermal_get_temp(struct thermal_zone_device *tz,
>>>> + unsigned long *temp)
>>>> +{
>>>> + struct __thermal_zone_device *data = tz->devdata;
>>>> +
>>>> + return data->get_temp(data->devdata, temp);
>>>> +}
>>>> +
>>>> +static
>>>> +int of_thermal_get_mode(struct thermal_zone_device *tz,
>>>> + enum thermal_device_mode *mode)
>>>> +{
>>>> + struct __thermal_zone_device *data = tz->devdata;
>>>> +
>>>> + *mode = data->mode;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static
>>>> +int of_thermal_set_mode(struct thermal_zone_device *tz,
>>>> + enum thermal_device_mode mode)
>>>> +{
>>>> + struct __thermal_zone_device *data = tz->devdata;
>>>> +
>>>> + mutex_lock(&tz->lock);
>>>> +
>>>> + if (mode == THERMAL_DEVICE_ENABLED)
>>>> + tz->polling_delay = data->polling_delay;
>>>> + else
>>>> + tz->polling_delay = 0;
>>>> +
>>>> + mutex_unlock(&tz->lock);
>>>> +
>>>> + data->mode = mode;
>>>> + thermal_zone_device_update(tz);
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static
>>>> +int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip,
>>>> + enum thermal_trip_type *type)
>>>> +{
>>>> + struct __thermal_zone_device *data = tz->devdata;
>>>> +
>>>> + if (trip >= data->ntrips || trip < 0)
>>>> + return -EDOM;
>>>
>>> It's far more common to use -EINVAL.
>>
>> hmm.. OK.
>>
>>>
>>>> +
>>>> + *type = data->trips[trip].type;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static
>>>> +int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip,
>>>> + unsigned long *temp)
>>>> +{
>>>> + struct __thermal_zone_device *data = tz->devdata;
>>>> +
>>>> + if (trip >= data->ntrips || trip < 0)
>>>> + return -EDOM;
>>>> +
>>>> + *temp = data->trips[trip].temperature;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static
>>>> +int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip,
>>>> + unsigned long temp)
>>>> +{
>>>> + struct __thermal_zone_device *data = tz->devdata;
>>>> +
>>>> + if (trip >= data->ntrips || trip < 0)
>>>> + return -EDOM;
>>>> +
>>>> + /* thermal fw should take care of data->mask & (1 << trip) */
>>>> + data->trips[trip].temperature = temp;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static
>>>> +int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip,
>>>> + unsigned long *hyst)
>>>> +{
>>>> + struct __thermal_zone_device *data = tz->devdata;
>>>> +
>>>> + if (trip >= data->ntrips || trip < 0)
>>>> + return -EDOM;
>>>> +
>>>> + *hyst = data->trips[trip].hysteresis;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static
>>>> +int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip,
>>>> + unsigned long hyst)
>>>> +{
>>>> + struct __thermal_zone_device *data = tz->devdata;
>>>> +
>>>> + if (trip >= data->ntrips || trip < 0)
>>>> + return -EDOM;
>>>> +
>>>> + /* thermal fw should take care of data->mask & (1 << trip) */
>>>
>>> Could you elaborate on that comment?
>>
>> In case the trip is not assigned to be writable, this function won't be
>> called.
>
> Ah. I misunderstood "fw" as "firmware", rather than "framework", and got
> confused. It would be nice if you could expand "fw" to make this
> clearer. :)

Sorry for that :-)

>
> Thanks,
> Mark.
>
> [1] http://en.wikipedia.org/wiki/Black-body_radiation
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-08-29 23:20:58

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

Mark, Pawel and Stephen,


On 27-08-2013 14:17, Eduardo Valentin wrote:
> On 27-08-2013 12:23, Mark Rutland wrote:
>> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote:
>>> Hello Mark,

<cut>

I believe now we need to align on thermal bindings, before I propose
next version of this series. So, following the discussions on this
thread, I believe a typical CPU thermal zone would look like the following:
+ #include <dt-bindings/thermal/thermal.h>
+
+ cpu_thermal: cpu_thermal {
+ passive-delay = <250>; /* milliseconds */
+ polling-delay = <1000>; /* milliseconds */
+ /*
+ * sensor ID
+ */
+ sensors = <&bandgap0 0>;
+ /*
+ * cooling
+ * device Usage Trip lower
bound upper bound
+ */
+ cooling-devices = <&cpu0 100 &cpu-alert
THERMAL_NO_LIMITS THERMAL_NO_LIMITS>;
+ trips {
+ cpu-alert: cpu-alert {
+ temperature = <100000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_PASSIVE>;
+ };
+ cpu-crit: cpu-crit {
+ temperature = <125000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_CRITICAL>;
+ };
+ };
+ };

*Note: THERMAL_NO_LIMIT means, it will be using the cooling device
minimum and maximum limits.

Couple of comments.
1. I am keeping the pooling intervals. A possible alternative way to
describe that would be specifying the maximum dT/dt. Essentially because
I am still convinced that this is part of hw specs.

2. The above follows your suggestion to use consumer pointing to
producers. Although, I still need to figure out how this could be
implemented for Linux. But I think that is another story, at least for
now. We need to first align on the DT binding itself.

3. The link from cooling device specification to trip points, from my
perspective, should be enough, and thus we shall not need the thermal
cells and size for trip points, as you proposed. Let me know what you think.

Below is another example, on a more complex scenario, board specific
case (hypothetical, just for exemplification):

+ #include <dt-bindings/thermal/thermal.h>
+
+ board_thermal: board_thermal {
+ passive-delay = <1000>; /* milliseconds */
+ polling-delay = <2500>; /* milliseconds */
+ /*
+ * sensor ID
+ */
+ sensors = <&adc-dummy 0>,
+ <&adc-dummy 1>,
+ <&adc-dymmy 2>;
+ /*
+ * cooling
+ * device Usage Trip lower upper
+ */
+ cooling-devices = <&cpu0 75 &cpu-trip 0 2>,
+ <&gpu0 40 &gpu-trip 0 2>;
+ <&lcd0 25 &lcd-trip 5 10>;
+ trips {
+ cpu-trip: cpu-trip {
+ temperature = <60000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_PASSIVE>;
+ };
+ gpu-trip: gpu-trip {
+ temperature = <55000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_PASSIVE>;
+ }
+ lcd-trip: lcp-trip {
+ temperature = <53000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_PASSIVE>;
+ };
+ crit-trip: crit-trip {
+ temperature = <68000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_CRITICAL>;
+ };
+ };
+ };

Now writing the above example, one might want to have a way to say the
sensor correlation equation.

Another example:
+ #include <dt-bindings/thermal/thermal.h>
+
+ dsp_thermal: dsp_thermal {
+ passive-delay = <250>; /* milliseconds */
+ polling-delay = <1000>; /* milliseconds */
+ /*
+ * sensor ID
+ */
+ sensors = <&bandgap0 1>;
+ /*
+ * cooling
+ * device Usage Trip lower
bound upper bound
+ */
+ cooling-devices = <&dsp0 100 &dsp-alert
THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ trips {
+ dsp-alert: dsp-alert {
+ temperature = <100000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_PASSIVE>;
+ };
+ dsp-crit: cdsp-crit {
+ temperature = <125000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_CRITICAL>;
+ };
+ };
+ };
+
+ gpu_thermal: gpu_thermal {
+ passive-delay = <500>; /* milliseconds */
+ polling-delay = <1000>; /* milliseconds */
+ /*
+ * sensor ID
+ */
+ sensors = <&bandgap0 2>;
+ /*
+ * cooling
+ * device Usage Trip lower
bound upper bound
+ */
+ cooling-devices = <&gpu0 100 &gpu-alert
THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ trips {
+ gpu-alert: gpu-alert {
+ temperature = <100000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_PASSIVE>;
+ };
+ gpu-crit: gpu-crit {
+ temperature = <125000>; /* milliCelsius */
+ hysteresis = <2000>; /* milliCelsius */
+ type = <THERMAL_TRIP_CRITICAL>;
+ };
+ };
+ };


Wei, I think the above would cover for the IPs with multiple internal
sensors cases you were looking for, right?

Durga, please also check if I am missing something to cover for your
plans to, in case you will be using DT to describe your HW.

Anyways, Mark and Pawel, let me know if I missed something from our
discussion. I believe the above bindings would look more like regular
standard DT bindings. And also they should be now slim enough, without
Linux implementation details and also without policies.

>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-09-02 08:13:41

by Wei Ni

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

On 08/30/2013 07:19 AM, Eduardo Valentin wrote:
> * PGP Signed: 08/29/2013 at 04:19:46 PM
>
> Mark, Pawel and Stephen,
>
>
> On 27-08-2013 14:17, Eduardo Valentin wrote:
>> On 27-08-2013 12:23, Mark Rutland wrote:
>>> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote:
>>>> Hello Mark,
>
> <cut>
>
> I believe now we need to align on thermal bindings, before I propose
> next version of this series. So, following the discussions on this
> thread, I believe a typical CPU thermal zone would look like the following:
> + #include <dt-bindings/thermal/thermal.h>
> +
> + cpu_thermal: cpu_thermal {
> + passive-delay = <250>; /* milliseconds */
> + polling-delay = <1000>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&bandgap0 0>;
> + /*
> + * cooling
> + * device Usage Trip lower
> bound upper bound
> + */
> + cooling-devices = <&cpu0 100 &cpu-alert
> THERMAL_NO_LIMITS THERMAL_NO_LIMITS>;
> + trips {
> + cpu-alert: cpu-alert {
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + cpu-crit: cpu-crit {
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
>
> *Note: THERMAL_NO_LIMIT means, it will be using the cooling device
> minimum and maximum limits.
>
> Couple of comments.
> 1. I am keeping the pooling intervals. A possible alternative way to
> describe that would be specifying the maximum dT/dt. Essentially because
> I am still convinced that this is part of hw specs.
>
> 2. The above follows your suggestion to use consumer pointing to
> producers. Although, I still need to figure out how this could be
> implemented for Linux. But I think that is another story, at least for
> now. We need to first align on the DT binding itself.
>
> 3. The link from cooling device specification to trip points, from my
> perspective, should be enough, and thus we shall not need the thermal
> cells and size for trip points, as you proposed. Let me know what you think.
>
> Below is another example, on a more complex scenario, board specific
> case (hypothetical, just for exemplification):
>
> + #include <dt-bindings/thermal/thermal.h>
> +
> + board_thermal: board_thermal {
> + passive-delay = <1000>; /* milliseconds */
> + polling-delay = <2500>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&adc-dummy 0>,
> + <&adc-dummy 1>,
> + <&adc-dymmy 2>;
> + /*
> + * cooling
> + * device Usage Trip lower upper
> + */
> + cooling-devices = <&cpu0 75 &cpu-trip 0 2>,
> + <&gpu0 40 &gpu-trip 0 2>;
> + <&lcd0 25 &lcd-trip 5 10>;
> + trips {
> + cpu-trip: cpu-trip {
> + temperature = <60000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + gpu-trip: gpu-trip {
> + temperature = <55000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + }
> + lcd-trip: lcp-trip {
> + temperature = <53000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + crit-trip: crit-trip {
> + temperature = <68000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
>
> Now writing the above example, one might want to have a way to say the
> sensor correlation equation.
>
> Another example:
> + #include <dt-bindings/thermal/thermal.h>
> +
> + dsp_thermal: dsp_thermal {
> + passive-delay = <250>; /* milliseconds */
> + polling-delay = <1000>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&bandgap0 1>;
> + /*
> + * cooling
> + * device Usage Trip lower
> bound upper bound
> + */
> + cooling-devices = <&dsp0 100 &dsp-alert
> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + trips {
> + dsp-alert: dsp-alert {
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + dsp-crit: cdsp-crit {
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
> +
> + gpu_thermal: gpu_thermal {
> + passive-delay = <500>; /* milliseconds */
> + polling-delay = <1000>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&bandgap0 2>;
> + /*
> + * cooling
> + * device Usage Trip lower
> bound upper bound
> + */
> + cooling-devices = <&gpu0 100 &gpu-alert
> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + trips {
> + gpu-alert: gpu-alert {
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + gpu-crit: gpu-crit {
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
>
>
> Wei, I think the above would cover for the IPs with multiple internal
> sensors cases you were looking for, right?
Yes, I think so, and this series is what I'm looking for, thanks again.

>
> Durga, please also check if I am missing something to cover for your
> plans to, in case you will be using DT to describe your HW.
>
> Anyways, Mark and Pawel, let me know if I missed something from our
> discussion. I believe the above bindings would look more like regular
> standard DT bindings. And also they should be now slim enough, without
> Linux implementation details and also without policies.
>
>>
>>
>
>

2013-09-02 16:29:56

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

On 29-08-2013 19:19, Eduardo Valentin wrote:
> Mark, Pawel and Stephen,
>
>
> On 27-08-2013 14:17, Eduardo Valentin wrote:
>> On 27-08-2013 12:23, Mark Rutland wrote:
>>> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote:
>>>> Hello Mark,
>
> <cut>
>
> I believe now we need to align on thermal bindings, before I propose
> next version of this series. So, following the discussions on this
> thread, I believe a typical CPU thermal zone would look like the following:
> + #include <dt-bindings/thermal/thermal.h>
> +
> + cpu_thermal: cpu_thermal {
> + passive-delay = <250>; /* milliseconds */
> + polling-delay = <1000>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&bandgap0 0>;
> + /*
> + * cooling
> + * device Usage Trip lower
> bound upper bound
> + */
> + cooling-devices = <&cpu0 100 &cpu-alert
> THERMAL_NO_LIMITS THERMAL_NO_LIMITS>;
> + trips {
> + cpu-alert: cpu-alert {
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + cpu-crit: cpu-crit {
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
>
> *Note: THERMAL_NO_LIMIT means, it will be using the cooling device
> minimum and maximum limits.
>
> Couple of comments.
> 1. I am keeping the pooling intervals. A possible alternative way to
> describe that would be specifying the maximum dT/dt. Essentially because
> I am still convinced that this is part of hw specs.
>
> 2. The above follows your suggestion to use consumer pointing to
> producers. Although, I still need to figure out how this could be
> implemented for Linux. But I think that is another story, at least for
> now. We need to first align on the DT binding itself.
>
> 3. The link from cooling device specification to trip points, from my
> perspective, should be enough, and thus we shall not need the thermal
> cells and size for trip points, as you proposed. Let me know what you think.
>
> Below is another example, on a more complex scenario, board specific
> case (hypothetical, just for exemplification):
>
> + #include <dt-bindings/thermal/thermal.h>
> +
> + board_thermal: board_thermal {
> + passive-delay = <1000>; /* milliseconds */
> + polling-delay = <2500>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&adc-dummy 0>,
> + <&adc-dummy 1>,
> + <&adc-dymmy 2>;
> + /*
> + * cooling
> + * device Usage Trip lower upper
> + */
> + cooling-devices = <&cpu0 75 &cpu-trip 0 2>,
> + <&gpu0 40 &gpu-trip 0 2>;
> + <&lcd0 25 &lcd-trip 5 10>;
> + trips {
> + cpu-trip: cpu-trip {
> + temperature = <60000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + gpu-trip: gpu-trip {
> + temperature = <55000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + }
> + lcd-trip: lcp-trip {
> + temperature = <53000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + crit-trip: crit-trip {
> + temperature = <68000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
>
> Now writing the above example, one might want to have a way to say the
> sensor correlation equation.
>
> Another example:
> + #include <dt-bindings/thermal/thermal.h>
> +
> + dsp_thermal: dsp_thermal {
> + passive-delay = <250>; /* milliseconds */
> + polling-delay = <1000>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&bandgap0 1>;
> + /*
> + * cooling
> + * device Usage Trip lower
> bound upper bound
> + */
> + cooling-devices = <&dsp0 100 &dsp-alert
> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + trips {
> + dsp-alert: dsp-alert {
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + dsp-crit: cdsp-crit {
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
> +
> + gpu_thermal: gpu_thermal {
> + passive-delay = <500>; /* milliseconds */
> + polling-delay = <1000>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&bandgap0 2>;
> + /*
> + * cooling
> + * device Usage Trip lower
> bound upper bound
> + */
> + cooling-devices = <&gpu0 100 &gpu-alert
> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + trips {
> + gpu-alert: gpu-alert {
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + gpu-crit: gpu-crit {
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
>
>
> Wei, I think the above would cover for the IPs with multiple internal
> sensors cases you were looking for, right?
>
> Durga, please also check if I am missing something to cover for your
> plans to, in case you will be using DT to describe your HW.
>
> Anyways, Mark and Pawel, let me know if I missed something from our
> discussion. I believe the above bindings would look more like regular
> standard DT bindings. And also they should be now slim enough, without
> Linux implementation details and also without policies.

Any objections on the above binding proposal?

>
>>
>>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-09-03 13:17:27

by Mark Rutland

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

On Fri, Aug 30, 2013 at 12:19:43AM +0100, Eduardo Valentin wrote:
> Mark, Pawel and Stephen,
>
>
> On 27-08-2013 14:17, Eduardo Valentin wrote:
> > On 27-08-2013 12:23, Mark Rutland wrote:
> >> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote:
> >>> Hello Mark,

Hi, sorry for the delay.

>
> <cut>
>
> I believe now we need to align on thermal bindings, before I propose
> next version of this series. So, following the discussions on this
> thread, I believe a typical CPU thermal zone would look like the following:
> + #include <dt-bindings/thermal/thermal.h>
> +
> + cpu_thermal: cpu_thermal {
> + passive-delay = <250>; /* milliseconds */
> + polling-delay = <1000>; /* milliseconds */

I'm still not keen on placing these intervals in the dt. Why can the
kernel not choose sensible values?

> + /*
> + * sensor ID
> + */
> + sensors = <&bandgap0 0>;
> + /*
> + * cooling
> + * device Usage Trip lower
> bound upper bound
> + */
> + cooling-devices = <&cpu0 100 &cpu-alert
> THERMAL_NO_LIMITS THERMAL_NO_LIMITS>;

I'm not sure it makes sense to group this data within a property. We
might need to extend this in future and it might not be possible to
extend this unambiguously. It may make more sense as a sub-node or
collection of properties.

> + trips {
> + cpu-alert: cpu-alert {
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + cpu-crit: cpu-crit {
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };

How do the upper and lower bounds in the cooling-devices property
interact with temperatures described in the trips node?

>
> *Note: THERMAL_NO_LIMIT means, it will be using the cooling device
> minimum and maximum limits.

I'm confused what are the cooling device min and max limits? Where are
they defined if not in the lower bound and upper bound entries of the
cooling-devices property?

>
> Couple of comments.
> 1. I am keeping the pooling intervals. A possible alternative way to
> describe that would be specifying the maximum dT/dt. Essentially because
> I am still convinced that this is part of hw specs.

I still don't see why it's not possible to do this dynamically. I am not
entirely convinced this is part of the hw spec.

>
> 2. The above follows your suggestion to use consumer pointing to
> producers. Although, I still need to figure out how this could be
> implemented for Linux. But I think that is another story, at least for
> now. We need to first align on the DT binding itself.

I assume you mean the sensors property? That looks fine to me, though we
may need to specify the way sensors relate to temperatures in trip
points and cooling device properties -- do those temperatures correspond
to an average of sensors readings, min/max, etc?

>
> 3. The link from cooling device specification to trip points, from my
> perspective, should be enough, and thus we shall not need the thermal
> cells and size for trip points, as you proposed. Let me know what you think.

I'm not keen on the mixed bag of values in the cooling-device property,
as it's difficult to extend in future if necessary. I think we may need
cells for cooling devices -- what if a single fan controller controls
fans in different thermal zones?

I'm also not keen on devices being described as their own cooling
device, as I think this is more difficult to support than not listing a
cooling device -- it'll have to be special-cased anyway.

> Below is another example, on a more complex scenario, board specific
> case (hypothetical, just for exemplification):
>
> + #include <dt-bindings/thermal/thermal.h>
> +
> + board_thermal: board_thermal {
> + passive-delay = <1000>; /* milliseconds */
> + polling-delay = <2500>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&adc-dummy 0>,
> + <&adc-dummy 1>,
> + <&adc-dymmy 2>;

Here we have a set of sensors, but no relationship between these sensors
and the trips or cooling-devices are described. How does that work?

> + /*
> + * cooling
> + * device Usage Trip lower upper
> + */
> + cooling-devices = <&cpu0 75 &cpu-trip 0 2>,
> + <&gpu0 40 &gpu-trip 0 2>;
> + <&lcd0 25 &lcd-trip 5 10>;
> + trips {
> + cpu-trip: cpu-trip {
> + temperature = <60000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + gpu-trip: gpu-trip {
> + temperature = <55000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + }
> + lcd-trip: lcp-trip {
> + temperature = <53000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + crit-trip: crit-trip {
> + temperature = <68000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
>
> Now writing the above example, one might want to have a way to say the
> sensor correlation equation.
>
> Another example:
> + #include <dt-bindings/thermal/thermal.h>
> +
> + dsp_thermal: dsp_thermal {
> + passive-delay = <250>; /* milliseconds */
> + polling-delay = <1000>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&bandgap0 1>;
> + /*
> + * cooling
> + * device Usage Trip lower
> bound upper bound
> + */
> + cooling-devices = <&dsp0 100 &dsp-alert
> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + trips {
> + dsp-alert: dsp-alert {
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + dsp-crit: cdsp-crit {
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
> +
> + gpu_thermal: gpu_thermal {
> + passive-delay = <500>; /* milliseconds */
> + polling-delay = <1000>; /* milliseconds */
> + /*
> + * sensor ID
> + */
> + sensors = <&bandgap0 2>;
> + /*
> + * cooling
> + * device Usage Trip lower
> bound upper bound
> + */
> + cooling-devices = <&gpu0 100 &gpu-alert
> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + trips {
> + gpu-alert: gpu-alert {
> + temperature = <100000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_PASSIVE>;
> + };
> + gpu-crit: gpu-crit {
> + temperature = <125000>; /* milliCelsius */
> + hysteresis = <2000>; /* milliCelsius */
> + type = <THERMAL_TRIP_CRITICAL>;
> + };
> + };
> + };
>
>
> Wei, I think the above would cover for the IPs with multiple internal
> sensors cases you were looking for, right?
>
> Durga, please also check if I am missing something to cover for your
> plans to, in case you will be using DT to describe your HW.
>
> Anyways, Mark and Pawel, let me know if I missed something from our
> discussion. I believe the above bindings would look more like regular
> standard DT bindings. And also they should be now slim enough, without
> Linux implementation details and also without policies.

I think that the above can describe that, but I'd like to see a binding
document so we can consider it in more detail.

Thanks,
Mark.

2013-09-03 17:13:51

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

Hi Mark,

On 03-09-2013 09:15, Mark Rutland wrote:
> On Fri, Aug 30, 2013 at 12:19:43AM +0100, Eduardo Valentin wrote:
>> Mark, Pawel and Stephen,
>>
>>
>> On 27-08-2013 14:17, Eduardo Valentin wrote:
>>> On 27-08-2013 12:23, Mark Rutland wrote:
>>>> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote:
>>>>> Hello Mark,
>
> Hi, sorry for the delay.
>

No issues,

>>
>> <cut>
>>
>> I believe now we need to align on thermal bindings, before I propose
>> next version of this series. So, following the discussions on this
>> thread, I believe a typical CPU thermal zone would look like the following:
>> + #include <dt-bindings/thermal/thermal.h>
>> +
>> + cpu_thermal: cpu_thermal {
>> + passive-delay = <250>; /* milliseconds */
>> + polling-delay = <1000>; /* milliseconds */
>
> I'm still not keen on placing these intervals in the dt. Why can the
> kernel not choose sensible values?
>
>> + /*
>> + * sensor ID
>> + */
>> + sensors = <&bandgap0 0>;
>> + /*
>> + * cooling
>> + * device Usage Trip lower
>> bound upper bound
>> + */
>> + cooling-devices = <&cpu0 100 &cpu-alert
>> THERMAL_NO_LIMITS THERMAL_NO_LIMITS>;
>
> I'm not sure it makes sense to group this data within a property. We
> might need to extend this in future and it might not be possible to
> extend this unambiguously. It may make more sense as a sub-node or
> collection of properties.
>

I see. Something like the previous DT binding? My previous proposal was
using sub-nodes.

>> + trips {
>> + cpu-alert: cpu-alert {
>> + temperature = <100000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_PASSIVE>;
>> + };
>> + cpu-crit: cpu-crit {
>> + temperature = <125000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_CRITICAL>;
>> + };
>> + };
>> + };
>
> How do the upper and lower bounds in the cooling-devices property
> interact with temperatures described in the trips node?

By means of the trip phandle. It would mean once the trip is crossed, it
is expected to modulate the cooling device between the specified limits.

>
>>
>> *Note: THERMAL_NO_LIMIT means, it will be using the cooling device
>> minimum and maximum limits.
>
> I'm confused what are the cooling device min and max limits? Where are
> they defined if not in the lower bound and upper bound entries of the
> cooling-devices property?

They were not specified in this example, because previous discussions
suggested that we shall not represent cooling devices as dt nodes as
they are virtual. Those would be part of the cooling device node, in the
case we want to have cooling devices nodes and property in DT.

>
>>
>> Couple of comments.
>> 1. I am keeping the pooling intervals. A possible alternative way to
>> describe that would be specifying the maximum dT/dt. Essentially because
>> I am still convinced that this is part of hw specs.
>
> I still don't see why it's not possible to do this dynamically. I am not
> entirely convinced this is part of the hw spec.

Because not all hardware have power sensors and deriving power vs.
temperature vs. time is not a simple task, specially based on estimates,
say on cpu load. Remember that we are not only talking about CPU
temperature here, there are other heat sources. Same load has different
consumption on different HW, how SW is supposed to find that out wihtout
proper sensing?

>
>>
>> 2. The above follows your suggestion to use consumer pointing to
>> producers. Although, I still need to figure out how this could be
>> implemented for Linux. But I think that is another story, at least for
>> now. We need to first align on the DT binding itself.
>
> I assume you mean the sensors property? That looks fine to me, though we

Sensors and cooling-devices.

> may need to specify the way sensors relate to temperatures in trip
> points and cooling device properties -- do those temperatures correspond
> to an average of sensors readings, min/max, etc?

Those are based on instant measurements.

>
>>
>> 3. The link from cooling device specification to trip points, from my
>> perspective, should be enough, and thus we shall not need the thermal
>> cells and size for trip points, as you proposed. Let me know what you think.
>
> I'm not keen on the mixed bag of values in the cooling-device property,
> as it's difficult to extend in future if necessary. I think we may need
> cells for cooling devices -- what if a single fan controller controls
> fans in different thermal zones?

I believe it is just a matter of representation. Using either a list or
subnodes would imply that one would need to add links to fan device
twice, one in each thermal zone (it does not matter if inside a subnode
or inside a list).

>
> I'm also not keen on devices being described as their own cooling
> device, as I think this is more difficult to support than not listing a
> cooling device -- it'll have to be special-cased anyway.

Then we need to have cooling device nodes, even if they are virtual.

>
>> Below is another example, on a more complex scenario, board specific
>> case (hypothetical, just for exemplification):
>>
>> + #include <dt-bindings/thermal/thermal.h>
>> +
>> + board_thermal: board_thermal {
>> + passive-delay = <1000>; /* milliseconds */
>> + polling-delay = <2500>; /* milliseconds */
>> + /*
>> + * sensor ID
>> + */
>> + sensors = <&adc-dummy 0>,
>> + <&adc-dummy 1>,
>> + <&adc-dymmy 2>;
>
> Here we have a set of sensors, but no relationship between these sensors
> and the trips or cooling-devices are described. How does that work?

See my query below.

>
>> + /*
>> + * cooling
>> + * device Usage Trip lower upper
>> + */
>> + cooling-devices = <&cpu0 75 &cpu-trip 0 2>,
>> + <&gpu0 40 &gpu-trip 0 2>;
>> + <&lcd0 25 &lcd-trip 5 10>;
>> + trips {
>> + cpu-trip: cpu-trip {
>> + temperature = <60000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_PASSIVE>;
>> + };
>> + gpu-trip: gpu-trip {
>> + temperature = <55000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_PASSIVE>;
>> + }
>> + lcd-trip: lcp-trip {
>> + temperature = <53000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_PASSIVE>;
>> + };
>> + crit-trip: crit-trip {
>> + temperature = <68000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_CRITICAL>;
>> + };
>> + };
>> + };
>>
>> Now writing the above example, one might want to have a way to say the
>> sensor correlation equation.

One way to go is to add a property to describe the linear relation
between them.

>>
>> Another example:
>> + #include <dt-bindings/thermal/thermal.h>
>> +
>> + dsp_thermal: dsp_thermal {
>> + passive-delay = <250>; /* milliseconds */
>> + polling-delay = <1000>; /* milliseconds */
>> + /*
>> + * sensor ID
>> + */
>> + sensors = <&bandgap0 1>;
>> + /*
>> + * cooling
>> + * device Usage Trip lower
>> bound upper bound
>> + */
>> + cooling-devices = <&dsp0 100 &dsp-alert
>> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
>> + trips {
>> + dsp-alert: dsp-alert {
>> + temperature = <100000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_PASSIVE>;
>> + };
>> + dsp-crit: cdsp-crit {
>> + temperature = <125000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_CRITICAL>;
>> + };
>> + };
>> + };
>> +
>> + gpu_thermal: gpu_thermal {
>> + passive-delay = <500>; /* milliseconds */
>> + polling-delay = <1000>; /* milliseconds */
>> + /*
>> + * sensor ID
>> + */
>> + sensors = <&bandgap0 2>;
>> + /*
>> + * cooling
>> + * device Usage Trip lower
>> bound upper bound
>> + */
>> + cooling-devices = <&gpu0 100 &gpu-alert
>> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
>> + trips {
>> + gpu-alert: gpu-alert {
>> + temperature = <100000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_PASSIVE>;
>> + };
>> + gpu-crit: gpu-crit {
>> + temperature = <125000>; /* milliCelsius */
>> + hysteresis = <2000>; /* milliCelsius */
>> + type = <THERMAL_TRIP_CRITICAL>;
>> + };
>> + };
>> + };
>>
>>
>> Wei, I think the above would cover for the IPs with multiple internal
>> sensors cases you were looking for, right?
>>
>> Durga, please also check if I am missing something to cover for your
>> plans to, in case you will be using DT to describe your HW.
>>
>> Anyways, Mark and Pawel, let me know if I missed something from our
>> discussion. I believe the above bindings would look more like regular
>> standard DT bindings. And also they should be now slim enough, without
>> Linux implementation details and also without policies.
>
> I think that the above can describe that, but I'd like to see a binding
> document so we can consider it in more detail.

Well, I can write another RFC based on this discussion, for sure. I just
want to know what really blocks this work, that is why I sent the
examples above. But it does not seam we are progressing, looks like we
are move discussing in circles.

Cooling devices seams to be bouncing back and forward. I can send an RFC
considering:
i) addition of cooling devices virtual device nodes
ii) description of used cooling devices inside thermal zones by means of
sub-nodes, instead of lists

Pooling intervals seams to be also a loose lace. To me it is a matter of
choosing a good way to avoid missing events. We can either get the info
from HW simulation / experimentation based on HW itself or we can rely
on SW and assume SW can handle to see all events based on loose info
(assume that sw can be smart to do a self adaptive closed loop control
system based on no defined set of inputs). Both solutions are doable,
except that one may be closer to reality than the other. Also, one is
based on HW info and the other forces sw to deduce HW info. If you can
provide info that your proposal has strong evidence to be closer to
reality, I am find to signed it off.

All best,


Eduardo.
>
> Thanks,
> Mark.
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-09-07 00:20:55

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [RFC PATCH 02/14] drivers: thermal: introduce device tree parser

Hi Mark, Stephen and Pawel,

On 03-09-2013 09:15, Mark Rutland wrote:
> On Fri, Aug 30, 2013 at 12:19:43AM +0100, Eduardo Valentin wrote:

<cut>

> I think that the above can describe that, but I'd like to see a binding
> document so we can consider it in more detail.

Find below another proposal. It is the updated binding document, with
the your comments applied (at least those I agree :-) ). It is obviously
an work in progress, but I think it is getting closer to what we are
trying to achieve, I believe. And of course, much better after using
your suggestions.

As I stated before, I believe it is crucial to first agree on the
bindings, then I can go ahead and update the corresponding code.

The change from the last binding examples I sent is basically on sensors
and cooling devices. This time, as suggested by Mark, I am adding
cooling device nodes (or at least properties to be embedded into
existing nodes). At some point, I remember that Pawel was not so in
favor on this type of node, but lets discuss on top of the document
below. I also added the #cells properties, as needed.

Hopefully we may end with an agreement. :-)

So, the document would look like this:

-----------------------------------------------------------------------
* Thermal Framework Device Tree descriptor

Generic binding to provide a way of defining hardware thermal
structure using device tree. A thermal structure includes thermal
zones and their components, such as trip points, polling intervals,
sensors and cooling devices binding descriptors.

The target of device tree thermal descriptors is to describe only
the hardware thermal aspects, not how the system must control or which
algorithm or policy must be taken in place.

There are five types of nodes involved to describe thermal bindings:
- sensors: used to describe the device source of temperature sensing;
- cooling devices: used to describe devices source of power dissipation
control;
- trip points: used to describe points in temperature domain defined to
make the system aware of hardware limits;
- cooling attachments: used to describe links between trip points and
cooling devices;
- thermal zones: used to describe thermal data within the hardware;

It follows a description of each type of these device tree nodes.

* Sensor devices

Sensor devices are nodes providing temperature sensing capabilities on
thermal
zones. Typical devices are I2C ADC converters and bandgaps. Theses are nodes
providing temperature data to thermal zones. Temperature sensor devices may
control one or more internal sensors.

Required property:
- #sensor-cells: Used to provide sensor device specific information
while referring to it. Must be at least 1, in order
to identify uniquely the sensor instances within
the IC. See thermal zone binding for more details
on how consumers refer to sensor devices.

* Cooling device nodes

Cooling devices are nodes providing control on power dissipation. There
are essentially two ways to provide control on power dissipation. First
is by means of regulating device performance, which is known as passive
cooling. Second is by means of activating devices in order to remove
the dissipated heat, which is known as active cooling, e.g. regulating
fan speeds. In both cases, cooling devices shall have a way to determine
the level of cooling.

Required property:
- cooling-min-level: A unsigned integer indicating the smallest
cooling level accepted. Typically 0.
- cooling-max-level: An unsigned integer indicating the largest
cooling level accepted.
- #cooling-cells: Used to provide cooling device specific information
while referring to it. Must be at least 2, in order
to specify minimum and maximum cooling level used
in the reference. See Cooling device attachments section
below for more details on how consumers refer to
cooling devices.

* Trip points

The trip node is a node to describe a point in the temperature domain
in which the system takes an action. This node describes just the point,
not the action.

Required properties:
- temperature: the trip temperature level, in milliCelsius.
- hysteresis: a (low) hysteresis value on 'temperature'. This is a
relative value, in milliCelsius.
- type: the trip type. Here is the type mapping:
THERMAL_TRIP_ACTIVE 0: A trip point to enable active cooling
THERMAL_TRIP_PASSIVE 1: A trip point to enable passive cooling
THERMAL_TRIP_HOT 2: A trip point to notify emergency
THERMAL_TRIP_CRITICAL 3: Hardware not reliable.

Refer to include/dt-bindings/thermal/thermal.h for definition of these
consts.

* Cooling device attachments

The cooling device attachments node is a node to describe how cooling
devices
get assigned to trip points of the zone. The cooling devices are expected
to be loaded in the target system.

Required properties:
- cooling-device: A phandle of a cooling device with its parameters,
referring to which cooling device is used in this
binding. The required parameters are: the minimum
cooling level and the maximum cooling level used
in this attach.
- trip: A phandle of a trip point node within the same thermal
zone.

Optional property:
- contribution: The cooling contribution to the thermal zone of the
referred cooling device at the referred trip point.
The contribution is a value from 0 to 100. The sum
of all cooling contributions within a thermal zone
must never exceed 100.

Note: Using the THERMAL_NO_LIMIT (-1L) constant in the cooling-device
phandle
limit parameters means:
(i) - minimum level allowed for minimum cooling level used in the
reference.
(ii) - maximum level allowed for maximum cooling level used in the
reference.
Refer to include/dt-bindings/thermal/thermal.h for definition of this
constant.

* Thermal zones

The thermal-zone node is the node containing all the required info
for describing a thermal zone, including its cdev bindings. The thermal_zone
node must contain, apart from its own properties, one node containing
trip nodes and one node containing all the zone cooling attachments.

Required properties:
- passive-delay: The maximum number of milliseconds to wait between polls
when performing passive cooling.
- polling-delay: The maximum number of milliseconds to wait between polls
when checking this thermal zone.
- sensors: A list of sensor phandles and their parameters. The
required parameter is the sensor id, in order to
identify internal sensors when the sensor IC features
several sensing units.
- trips: A sub-node containing several trip point nodes required
to describe the thermal zone.
- cooling-attachments A sub-node containing several cooling device attaches
nodes, used to describe the relation between trips
and cooling devices.

Optional property:
- coefficients: An array of integers (one signed cell) containing
coefficients to compose a linear relation between
the sensors described in the sensors property.
Coefficients defaults to 1, in case this property
is not specified. A simple linear polynomial is used:
Z = c0 * x0 + c1 + x1 + ... + c(n-1) * x(n-1) + cn.

The coefficients are ordered and they match with sensors
by means of sensor ID. Additional coefficients are
interpreted as constant offsets.

Note: The delay properties are bound to the maximum dT/dt (temperature
derivative over time) in two situations for a thermal zone:
(i) - when active cooling is activated (passive-delay); and
(ii) - when the zone just needs to be monitored (polling-delay).
The maximum dT/dt is highly bound to hardware power consumption and
dissipation
capability.

* Examples

Below are several examples on how to use thermal data descriptors
using device tree bindings:

(a) - CPU thermal zone

The CPU thermal zone example below describes how to setup one thermal zone
using one single sensor as temperature source and many cooling devices and
power dissipation control sources.

#include <dt-bindings/thermal/thermal.h>

cpus {
cpu0: cpu@0 {
...
cooling-min-level = 0;
cooling-max-level = 3;
#cooling-cells = <2>; /* min followed by max */
};
...
};

&i2c1 {
...
fan0: fan@0x48 {
...
cooling-min-level = 0;
cooling-max-level = 9;
#cooling-cells = <2>; /* min followed by max */
};
};

bandgap0: bandgap@0x0000ED00 {
...
#sensor-cells = <1>;
};

cpu-thermal: cpu-thermal {
passive-delay = <250>; /* milliseconds */
polling-delay = <1000>; /* milliseconds */

/* sensor ID */
sensors = <&bandgap0 0>;

trips {
cpu-alert0: cpu-alert {
temperature = <90000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_ACTIVE>;
};
cpu-alert1: cpu-alert {
temperature = <100000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
};
cpu-crit: cpu-crit {
temperature = <125000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_CRITICAL>;
};
};

cooling-attachments {
attach0 {
trip = <&cpu-alert0>;
cooling-devices = <&fan0 THERMAL_NO_LIMITS 4>;
};
attach1 {
trip = <&cpu-alert1>;
cooling-device = <&fan0 5 THERMAL_NO_LIMITS>;
};
attach2 {
trip = <&cpu-alert1>;
cooling-device =
<&cpu0 THERMAL_NO_LIMITS THERMAL_NO_LIMITS>;
};
};
};

In the example above, the ADC sensor at address 0x0000ED00 is used to
monitor
the zone 'cpu-thermal' using its the sensor 0. The fan0, a fan device
controlled
via I2C bus 1, at adress 0x48, is used to remove the heat out of the thermal
zone 'cpu-thermal' using its cooling levels from its minimum to 4, when it
reaches trip point 'cpu-alert0' at 90C, as an example of active cooling. The
same cooling device is used at 'cpu-alert1', but from 5 to its maximum
level.
The cpu@0 device is also linked to the same thermal zone, 'cpu-thermal',
as a
passive cooling device, using all its cooling levels at trip point
'cpu-alert1',
which is a trip point at 100C.

(b) - IC with several internal sensors

The example below describes how to deploy several thermal zones based off a
single sensor IC, assuming it has several internal sensors. This is a common
case on SoC designs with several internal IPs that may need different
thermal
requirements, and thus may have their own sensor to monitor or detect
internal
hotspots in their silicon.

#include <dt-bindings/thermal/thermal.h>

bandgap0: bandgap@0x0000ED00 {
...
#sensor-cells = <1>;
};

cpu-thermal: cpu-thermal {
passive-delay = <250>; /* milliseconds */
polling-delay = <1000>; /* milliseconds */

/* sensor ID */
sensors = <&bandgap0 0>;

trips {
/* each zone within the SoC may have its own trips */
cpu-alert: cpu-alert {
temperature = <100000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
};
cpu-crit: cpu-crit {
temperature = <125000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_CRITICAL>;
};
};

cooling-attachments {
/* each zone within the SoC may have its own cooling */
...
};
};

gpu-thermal: gpu-thermal {
passive-delay = <120>; /* milliseconds */
polling-delay = <1000>; /* milliseconds */

/* sensor ID */
sensors = <&bandgap0 1>;

trips {
/* each zone within the SoC may have its own trips */
gpu-alert: gpu-alert {
temperature = <90000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
};
gpu-crit: gpu-crit {
temperature = <105000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_CRITICAL>;
};
};

cooling-attachments {
/* each zone within the SoC may have its own cooling */
...
};
};

dsp-thermal: dsp-thermal {
passive-delay = <50>; /* milliseconds */
polling-delay = <1000>; /* milliseconds */

/* sensor ID */
sensors = <&bandgap0 2>;

trips {
/* each zone within the SoC may have its own trips */
dsp-alert: gpu-alert {
temperature = <90000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
};
dsp-crit: gpu-crit {
temperature = <135000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_CRITICAL>;
};
};

cooling-attachments {
/* each zone within the SoC may have its own cooling */
...
};
};

In the example above there is one bandgap IC which has the capability to
monitor three sensors. The hardware has been designed so that sensors are
placed on different places in the DIE to monitor different temperature
hotspots: one for CPU thermal zone, one for GPU thermal zone and the
other to monitor a DSP thermal zone.

Thus, there is a need to assign each sensor provided by the bandgap IC
to different thermal zones. This is achieved by means of using the
#sensor-cells property and using the first parameter as sensor ID.
In the example, then, bandgap.sensor0 is used to monitor CPU thermal zone,
bandgap.sensor1 is used to monitor GPU thermal zone and bandgap.sensor2
is used to monitor DSP thermal zone. Each zone may be uncorrelated,
having its own dT/dt requirements, trips and cooling attachments.

(c) - Several sensors within one single thermal zone

The example below illustrates how to use more than one sensor within
one thermal zone.

#include <dt-bindings/thermal/thermal.h>

&i2c1 {
...
adc: sensor@0x49 {
...
#sensor-cells = <1>;
};
};

bandgap0: bandgap@0x0000ED00 {
...
#sensor-cells = <1>;
};

cpu-thermal: cpu-thermal {
passive-delay = <250>; /* milliseconds */
polling-delay = <1000>; /* milliseconds */

/* sensor ID */
sensors = <&bandgap0 0>,
<&adc 0>;

/* hotspot = 100 * bandgap - 120 * adc + 484 */
coefficients = <100 -120 484>;

trips {
...
};

cooling-attachments {
...
};
};

In some cases, there is a need to use more than one sensor to extrapolate
a thermal hotspot in the silicon. The above example illustrate this
situation.
For instance, it may be the case that a sensor external to CPU IP may be
place
close to CPU hotspot and together with internal CPU sensor, it is used
to determine the hotspot. The hyppotetical extrapolation rule would be:
hotspot = 100 * bandgap - 120 * adc + 484

The same idea can be used to add fixed offset:
passive-delay = <1000>; /* milliseconds */
polling-delay = <2500>; /* milliseconds */
hotspot = 1 * adc + 6000

In the above equation, the hotspot is always 6C higher than what is read
from the sensor ADC. The binding would be then:
/* sensor ID */
sensors = <&adc 0>;

/* hotspot = 1 * adc + 6000 */
coefficients = <1 6000>;

(d) - Board thermal

The board thermal example below illustrates how to setup one thermal zone
with many sensors and many cooling devices.

#include <dt-bindings/thermal/thermal.h>

&i2c1 {
...
adc-dummy: sensor@0x50 {
...
#sensor-cells = <1>; /* sensor internal ID */
};
};

batt-thermal {
passive-delay = <500>; /* milliseconds */
polling-delay = <2500>; /* milliseconds */

/* sensor ID */
sensors = <&adc-dummy 4>;

trips {
...
};

cooling-attachments {
...
};
};

board-thermal: board-thermal {
passive-delay = <1000>; /* milliseconds */
polling-delay = <2500>; /* milliseconds */

/* sensor ID */
sensors = <&adc-dummy 0>,
<&adc-dummy 1>,
<&adc-dymmy 2>;
/*
* An array of coefficients describing the sensor
* linear relation. E.g.:
* z = c1*x1 + c2*x2 + c3*x3
*/
coefficients = <1200 -345 890>;

trips {
/* Trips are based on resulting linear equation */
cpu-trip: cpu-trip {
temperature = <60000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
};
gpu-trip: gpu-trip {
temperature = <55000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
}
lcd-trip: lcp-trip {
temperature = <53000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
};
crit-trip: crit-trip {
temperature = <68000>; /* milliCelsius */
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_CRITICAL>;
};
};

cooling-attachments {
attach0 {
trip = <&cpu-trip>;
cooling-device = <&cpu0 0 2>;
contribution = <55>;
};
attach1 {
trip = <&gpu-trip>;
cooling-device = <&gpu0 0 2>;
contribution = <20>;
};
attach2 {
trip = <&lcd-trip>;
cooling-device = <&lcd0 5 10>;
contribution = <15>;
};
};
};

The above example is a mix of previous examples, a sensor IP with
several internal
sensors used to monitor different zones, one of them is composed by
several sensors and
with different cooling devices.
-----------------------------------------------------------------------

All best,


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature