2013-07-22 14:26:48

by Eduardo Valentin

[permalink] [raw]
Subject: RFC: device thermal limits represented in device tree nodes

Hello Grant and Rob,

(Resending, as I got a message saying:
<[email protected]>: Recipient address rejected: User
has moved to devicetree at vger.kernel.org)

I am writing this email to you specifically to ask your technical
assessment with respect to representing device thermal limits as device
tree nodes. I am proposing to introduce device tree nodes to describe
these limits as thermal zones, their composition and their relations
with cooling devices and other thermal zones (thermal data).

As you should know, device thermal limits are part of hardware
specification. Considering your board layout, mechanics, power
dissipation and composition of ICs, etc, that will impose thermal
requirements on your system, and infringing these limits can lead to
device damage, device life time reduction or even end user harm. Thus,
the thermal data help to describe the hardware limits and what needs to
be done if those limits are crosses, as part of your board design and
non-functional requirements. Obviously that is very dependent on your
hardware, and not all of them will have these non-functional
requirements. Besides, describing these limits has *nothing* to do with
how you actually find these limits.

In any case, there is a need to properly represent these requirements
and I am proposing to have this representation in device tree. There
were already couple of counter-arguments claiming this is actually about
configuration and performance profile description. But I still stand
against these two readings of this proposal and again state that if one
interprets it as configuration or performance profile, that is a
mis-understanding [0]. Let me state it clear (again [1]), my proposal is
to describe hardware thermal limits, because these limits are part of a
hardware specification; representing in device tree would not infringe
the original purpose of this data structure ("The Device Tree is a data
structure for describing hardware."[2]).

Before I explain my proposal, I want to highlight also that these data
is represented elsewhere already and it is reused across different OS's.
Thermal data is described using ACPI [3] and operating systems
ACPI-aware do support the interpretation of thermal data. Linux is one
example of such systems (I believe I do not need to enlist here all
systems supporting ACPI). On the other hand, not all systems have ACPI
or are specified to use ACPI. Thus, here is another reason to represent
properly thermal data, so that we can scale across systems.

In the specific case of Linux, the common thermal concepts between ACPI
systems and non-ACPI systems have been represented in the thermal
framework (CONFIG_THERMAL). Today, on ACPI systems, thermal data is
fetched from bootloader with help from the common ACPI parser. For
non-ACPI systems, the thermal data is actually coded as part of device
drivers.

So, to the point, a brief explanation of my proposal goes as follows:
i - trip points: a node to describe a point in the temperature domain
in which the system has to take an action. This node describes just the
point, not the action. Properties here are temperature, hysteresis, and
type (critical, hot, passive, active, etc).
ii - binding parameters: the bind_param node is a node to describe how
actions (cooling devices) get assigned to trip points. Cooling devices
are expected to be loaded in the target system. Properties here are:
cooling device name, weight, trip_mask and limits.
iii - thermal zones: the thermal_zone node is the node containing all
the required info for describing a thermal zone with hardware thermal
limitation, including its bindings with cooling devices. Properties here
are: type, passive_delay, polling_delay, governor. The thermal_zone
node must contain, apart from its own properties, one node containing
trip nodes and one node containing all the zone bind parameters.

Here is an example (on OMAP4430):
thermal_zone {
type = "CPU";
mask = <0x03>; /* trips writability */
passive_delay = <250>; /* milliseconds */
polling_delay = <1000>; /* milliseconds */
governor = "step_wise";
trips {
alert@100000{
temperature = <100000>; /* milliCelsius
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
};
crit@125000{
temperature = <125000>; /* milliCelsius
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_CRITICAL>;
};
};
bind_params {
action@0{
cooling_device = "thermal-cpufreq";
weight = <100>; /* percentage */
mask = <0x01>;
/* no limits, using defaults */
};
};
};

In this current proposal, a 'thermal_zone' node would be embedded inside
a temperature sensor node, for simplicity. But other possible builds
could embedded them in the device with thermal limits (CPU nodes, for
instance) or they could be not embedded in any specific node.

A full documented description can be found here [4]. Also a branch
containing:
(a) needed changes in order to have this DT parser;
(b) the DT parser with documentation
(c) examples on how drivers could be changes to use the parser
can be found in my branch here [5]. I wrote the thermal DT parser to
build thermal zones with the thermal framework API. However, if one does
not want to do that, it can simple do not include a CONFIG_THERMAL_OF=y
in her/his build, and the calls will be translated to nops, and the
device tree thermal data can be parsed to somewhere else interested
(other subsystem or even user land). A TODO on this implementation is
that it still lacks the representation of thermal zones composed by
several sensors. However, I believe it is better to take an incremental
approach here. This series can already be used to improve most of the
existing platform thermal drivers (most are CPU thermal drivers) and to
reuse the existing code of some hwmon sensors to build thermal zones for
board thermal requirements.

I have already posted a patch series with this proposal on [6], that
contains a reference for the original RFC. But looks like my messages
got moderated on device tree mailing list. Obviously, within PM forum,
feedback was quite positive. However, we cannot proceed without proper
assessment of other subsystems. lm-sensors folks (Guenter) seam to be
strongly against this series, as there is a fear that this may introduce
a mis-usage of DT. I still believe this is needed for hardware
description, and thus not a infringement on DT purposes.

Please let me know your thoughts on this topic and apologize me if my
previous messages on this topic did not reach you (hope they reach now).

All best,

Eduardo Valentin

[0] - https://lkml.org/lkml/2013/7/17/621
[1] - https://lkml.org/lkml/2013/7/18/279
[2] - http://www.devicetree.org
[3] - http://www.acpi.info/
[4] -
https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/diff/Documentation/devicetree/bindings/thermal/thermal.txt?h=thermal_work/thermal_core/dt_parser&id=405bf0b51457ed055a082af2653d7ce757bc2e91
[5] -
https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/log/?h=thermal_work/thermal_core/dt_parser
[6] - https://lkml.org/lkml/2013/7/17/923


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin




Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-07-24 01:44:47

by Stephen Warren

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On 07/22/2013 07:25 AM, Eduardo Valentin wrote:
> Hello Grant and Rob,
>
> (Resending, as I got a message saying:
> <[email protected]>: Recipient address rejected:
> User has moved to devicetree at vger.kernel.org)
>
> I am writing this email to you specifically to ask your technical
> assessment with respect to representing device thermal limits as
> device tree nodes. I am proposing to introduce device tree nodes to
> describe these limits as thermal zones, their composition and their
> relations with cooling devices and other thermal zones (thermal
> data).

Given:
https://lkml.org/lkml/2013/7/20/69
[PATCH 3/3] MAINTAINERS: Refactor device tree maintainership

I'm explicitly CCing a few people besides Grant/Rob, and qouting the
whole email.

>From my perspective, the concept of including thermal limits in DT
seems reasonable, although I haven't looked at the proposed binding
itself in detail yet.

> As you should know, device thermal limits are part of hardware
> specification. Considering your board layout, mechanics, power
> dissipation and composition of ICs, etc, that will impose thermal
> requirements on your system, and infringing these limits can lead
> to device damage, device life time reduction or even end user harm.
> Thus, the thermal data help to describe the hardware limits and
> what needs to be done if those limits are crosses, as part of your
> board design and non-functional requirements. Obviously that is
> very dependent on your hardware, and not all of them will have
> these non-functional requirements. Besides, describing these limits
> has *nothing* to do with how you actually find these limits.
>
> In any case, there is a need to properly represent these
> requirements and I am proposing to have this representation in
> device tree. There were already couple of counter-arguments
> claiming this is actually about configuration and performance
> profile description. But I still stand against these two readings
> of this proposal and again state that if one interprets it as
> configuration or performance profile, that is a mis-understanding
> [0]. Let me state it clear (again [1]), my proposal is to describe
> hardware thermal limits, because these limits are part of a
> hardware specification; representing in device tree would not
> infringe the original purpose of this data structure ("The Device
> Tree is a data structure for describing hardware."[2]).
>
> Before I explain my proposal, I want to highlight also that these
> data is represented elsewhere already and it is reused across
> different OS's. Thermal data is described using ACPI [3] and
> operating systems ACPI-aware do support the interpretation of
> thermal data. Linux is one example of such systems (I believe I do
> not need to enlist here all systems supporting ACPI). On the other
> hand, not all systems have ACPI or are specified to use ACPI.
> Thus, here is another reason to represent properly thermal data, so
> that we can scale across systems.
>
> In the specific case of Linux, the common thermal concepts between
> ACPI systems and non-ACPI systems have been represented in the
> thermal framework (CONFIG_THERMAL). Today, on ACPI systems, thermal
> data is fetched from bootloader with help from the common ACPI
> parser. For non-ACPI systems, the thermal data is actually coded as
> part of device drivers.
>
> So, to the point, a brief explanation of my proposal goes as
> follows: i - trip points: a node to describe a point in the
> temperature domain in which the system has to take an action. This
> node describes just the point, not the action. Properties here are
> temperature, hysteresis, and type (critical, hot, passive, active,
> etc). ii - binding parameters: the bind_param node is a node to
> describe how actions (cooling devices) get assigned to trip points.
> Cooling devices are expected to be loaded in the target system.
> Properties here are: cooling device name, weight, trip_mask and
> limits. iii - thermal zones: the thermal_zone node is the node
> containing all the required info for describing a thermal zone with
> hardware thermal limitation, including its bindings with cooling
> devices. Properties here are: type, passive_delay, polling_delay,
> governor. The thermal_zone node must contain, apart from its own
> properties, one node containing trip nodes and one node containing
> all the zone bind parameters.
>
> Here is an example (on OMAP4430): thermal_zone { type = "CPU"; mask
> = <0x03>; /* trips writability */ passive_delay = <250>; /*
> milliseconds */ polling_delay = <1000>; /* milliseconds */ governor
> = "step_wise"; trips { alert@100000{ temperature = <100000>; /*
> milliCelsius hysteresis = <2000>; /* milliCelsius */ type =
> <THERMAL_TRIP_PASSIVE>; }; crit@125000{ temperature = <125000>; /*
> milliCelsius hysteresis = <2000>; /* milliCelsius */ type =
> <THERMAL_TRIP_CRITICAL>; }; }; bind_params { action@0{
> cooling_device = "thermal-cpufreq"; weight = <100>; /* percentage
> */ mask = <0x01>; /* no limits, using defaults */ }; }; };
>
> In this current proposal, a 'thermal_zone' node would be embedded
> inside a temperature sensor node, for simplicity. But other
> possible builds could embedded them in the device with thermal
> limits (CPU nodes, for instance) or they could be not embedded in
> any specific node.
>
> A full documented description can be found here [4]. Also a branch
> containing: (a) needed changes in order to have this DT parser; (b)
> the DT parser with documentation (c) examples on how drivers could
> be changes to use the parser can be found in my branch here [5]. I
> wrote the thermal DT parser to build thermal zones with the thermal
> framework API. However, if one does not want to do that, it can
> simple do not include a CONFIG_THERMAL_OF=y in her/his build, and
> the calls will be translated to nops, and the device tree thermal
> data can be parsed to somewhere else interested (other subsystem or
> even user land). A TODO on this implementation is that it still
> lacks the representation of thermal zones composed by several
> sensors. However, I believe it is better to take an incremental
> approach here. This series can already be used to improve most of
> the existing platform thermal drivers (most are CPU thermal
> drivers) and to reuse the existing code of some hwmon sensors to
> build thermal zones for board thermal requirements.
>
> I have already posted a patch series with this proposal on [6],
> that contains a reference for the original RFC. But looks like my
> messages got moderated on device tree mailing list. Obviously,
> within PM forum, feedback was quite positive. However, we cannot
> proceed without proper assessment of other subsystems. lm-sensors
> folks (Guenter) seam to be strongly against this series, as there
> is a fear that this may introduce a mis-usage of DT. I still
> believe this is needed for hardware description, and thus not a
> infringement on DT purposes.
>
> Please let me know your thoughts on this topic and apologize me if
> my previous messages on this topic did not reach you (hope they
> reach now).
>
> All best,
>
> Eduardo Valentin
>
> [0] - https://lkml.org/lkml/2013/7/17/621 [1] -
> https://lkml.org/lkml/2013/7/18/279 [2] - http://www.devicetree.org [3] -
> http://www.acpi.info/ [4] -
> https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/diff/Documentation/devicetree/bindings/thermal/thermal.txt?h=thermal_work/thermal_core/dt_parser&id=405bf0b51457ed055a082af2653d7ce757bc2e91
>
>
[5] -
> https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/log/?h=thermal_work/thermal_core/dt_parser
>
>
[6] - https://lkml.org/lkml/2013/7/17/923
>
>

2013-07-24 10:45:50

by Mark Rutland

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On Wed, Jul 24, 2013 at 02:44:38AM +0100, Stephen Warren wrote:
> On 07/22/2013 07:25 AM, Eduardo Valentin wrote:
> > Hello Grant and Rob,
> >
> > (Resending, as I got a message saying:
> > <[email protected]>: Recipient address rejected:
> > User has moved to devicetree at vger.kernel.org)
> >
> > I am writing this email to you specifically to ask your technical
> > assessment with respect to representing device thermal limits as
> > device tree nodes. I am proposing to introduce device tree nodes to
> > describe these limits as thermal zones, their composition and their
> > relations with cooling devices and other thermal zones (thermal
> > data).
>
> Given:
> https://lkml.org/lkml/2013/7/20/69
> [PATCH 3/3] MAINTAINERS: Refactor device tree maintainership
>
> I'm explicitly CCing a few people besides Grant/Rob, and qouting the
> whole email.
>
> From my perspective, the concept of including thermal limits in DT
> seems reasonable, although I haven't looked at the proposed binding
> itself in detail yet.

The concept of defining hard thermal limits in DT certianly seems
reasonable.

>From a quick look at the version on lkml [1], it seems like this leaks a
Linux implementation details (e.g. governer names) into the binding, and
I think that the linkage of devices to thermal zones should be definedd
more explicitly. A reposting of the series to devicetree (and lakml?)
would be helpful for review.

Thanks,
Mark.

[1] https://lkml.org/lkml/2013/7/17/379

>
> > As you should know, device thermal limits are part of hardware
> > specification. Considering your board layout, mechanics, power
> > dissipation and composition of ICs, etc, that will impose thermal
> > requirements on your system, and infringing these limits can lead
> > to device damage, device life time reduction or even end user harm.
> > Thus, the thermal data help to describe the hardware limits and
> > what needs to be done if those limits are crosses, as part of your
> > board design and non-functional requirements. Obviously that is
> > very dependent on your hardware, and not all of them will have
> > these non-functional requirements. Besides, describing these limits
> > has *nothing* to do with how you actually find these limits.
> >
> > In any case, there is a need to properly represent these
> > requirements and I am proposing to have this representation in
> > device tree. There were already couple of counter-arguments
> > claiming this is actually about configuration and performance
> > profile description. But I still stand against these two readings
> > of this proposal and again state that if one interprets it as
> > configuration or performance profile, that is a mis-understanding
> > [0]. Let me state it clear (again [1]), my proposal is to describe
> > hardware thermal limits, because these limits are part of a
> > hardware specification; representing in device tree would not
> > infringe the original purpose of this data structure ("The Device
> > Tree is a data structure for describing hardware."[2]).
> >
> > Before I explain my proposal, I want to highlight also that these
> > data is represented elsewhere already and it is reused across
> > different OS's. Thermal data is described using ACPI [3] and
> > operating systems ACPI-aware do support the interpretation of
> > thermal data. Linux is one example of such systems (I believe I do
> > not need to enlist here all systems supporting ACPI). On the other
> > hand, not all systems have ACPI or are specified to use ACPI.
> > Thus, here is another reason to represent properly thermal data, so
> > that we can scale across systems.
> >
> > In the specific case of Linux, the common thermal concepts between
> > ACPI systems and non-ACPI systems have been represented in the
> > thermal framework (CONFIG_THERMAL). Today, on ACPI systems, thermal
> > data is fetched from bootloader with help from the common ACPI
> > parser. For non-ACPI systems, the thermal data is actually coded as
> > part of device drivers.
> >
> > So, to the point, a brief explanation of my proposal goes as
> > follows: i - trip points: a node to describe a point in the
> > temperature domain in which the system has to take an action. This
> > node describes just the point, not the action. Properties here are
> > temperature, hysteresis, and type (critical, hot, passive, active,
> > etc). ii - binding parameters: the bind_param node is a node to
> > describe how actions (cooling devices) get assigned to trip points.
> > Cooling devices are expected to be loaded in the target system.
> > Properties here are: cooling device name, weight, trip_mask and
> > limits. iii - thermal zones: the thermal_zone node is the node
> > containing all the required info for describing a thermal zone with
> > hardware thermal limitation, including its bindings with cooling
> > devices. Properties here are: type, passive_delay, polling_delay,
> > governor. The thermal_zone node must contain, apart from its own
> > properties, one node containing trip nodes and one node containing
> > all the zone bind parameters.
> >
> > Here is an example (on OMAP4430): thermal_zone { type = "CPU"; mask
> > = <0x03>; /* trips writability */ passive_delay = <250>; /*
> > milliseconds */ polling_delay = <1000>; /* milliseconds */ governor
> > = "step_wise"; trips { alert@100000{ temperature = <100000>; /*
> > milliCelsius hysteresis = <2000>; /* milliCelsius */ type =
> > <THERMAL_TRIP_PASSIVE>; }; crit@125000{ temperature = <125000>; /*
> > milliCelsius hysteresis = <2000>; /* milliCelsius */ type =
> > <THERMAL_TRIP_CRITICAL>; }; }; bind_params { action@0{
> > cooling_device = "thermal-cpufreq"; weight = <100>; /* percentage
> > */ mask = <0x01>; /* no limits, using defaults */ }; }; };
> >
> > In this current proposal, a 'thermal_zone' node would be embedded
> > inside a temperature sensor node, for simplicity. But other
> > possible builds could embedded them in the device with thermal
> > limits (CPU nodes, for instance) or they could be not embedded in
> > any specific node.
> >
> > A full documented description can be found here [4]. Also a branch
> > containing: (a) needed changes in order to have this DT parser; (b)
> > the DT parser with documentation (c) examples on how drivers could
> > be changes to use the parser can be found in my branch here [5]. I
> > wrote the thermal DT parser to build thermal zones with the thermal
> > framework API. However, if one does not want to do that, it can
> > simple do not include a CONFIG_THERMAL_OF=y in her/his build, and
> > the calls will be translated to nops, and the device tree thermal
> > data can be parsed to somewhere else interested (other subsystem or
> > even user land). A TODO on this implementation is that it still
> > lacks the representation of thermal zones composed by several
> > sensors. However, I believe it is better to take an incremental
> > approach here. This series can already be used to improve most of
> > the existing platform thermal drivers (most are CPU thermal
> > drivers) and to reuse the existing code of some hwmon sensors to
> > build thermal zones for board thermal requirements.
> >
> > I have already posted a patch series with this proposal on [6],
> > that contains a reference for the original RFC. But looks like my
> > messages got moderated on device tree mailing list. Obviously,
> > within PM forum, feedback was quite positive. However, we cannot
> > proceed without proper assessment of other subsystems. lm-sensors
> > folks (Guenter) seam to be strongly against this series, as there
> > is a fear that this may introduce a mis-usage of DT. I still
> > believe this is needed for hardware description, and thus not a
> > infringement on DT purposes.
> >
> > Please let me know your thoughts on this topic and apologize me if
> > my previous messages on this topic did not reach you (hope they
> > reach now).
> >
> > All best,
> >
> > Eduardo Valentin
> >
> > [0] - https://lkml.org/lkml/2013/7/17/621 [1] -
> > https://lkml.org/lkml/2013/7/18/279 [2] - http://www.devicetree.org [3] -
> > http://www.acpi.info/ [4] -
> > https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/diff/Documentation/devicetree/bindings/thermal/thermal.txt?h=thermal_work/thermal_core/dt_parser&id=405bf0b51457ed055a082af2653d7ce757bc2e91
> >
> >
> [5] -
> > https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/log/?h=thermal_work/thermal_core/dt_parser
> >
> >
> [6] - https://lkml.org/lkml/2013/7/17/923
> >
> >
>
>

2013-07-24 11:19:12

by Pawel Moll

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

Greeting,

Funnily enough I had this discussion couple a months ago and made my
mind back then :-)

> On 07/22/2013 07:25 AM, Eduardo Valentin wrote:
> > representing in device tree would not
> > infringe the original purpose of this data structure ("The Device
> > Tree is a data structure for describing hardware."[2]).

I couldn't agree more, with a few remarks...

> > In this current proposal, a 'thermal_zone' node would be embedded
> > inside a temperature sensor node, for simplicity. But other
> > possible builds could embedded them in the device with thermal
> > limits (CPU nodes, for instance) or they could be not embedded in
> > any specific node.

So this is the detail that actually caused most of the disagreement :-)

My position on this can be summarized as follows:

1. As you have pointed out, the thermal limits are related to the
*device being monitored*, not the sensor itself.

2. Therefore the tree should express relation between those two; a
sensor mode should be connected (via phandle most likely) to the device
being monitored, eg. (I'm not suggesting any property names, just trying
to express the idea ;-) memory mapped PVT "sensor" monitoring a core
(cpu) temperature could look like this:
cpu0: cpu@0 {
};

sensor@xxxx {
reg = <xxx>;
device_monitored = <&cpu0>;
};
while I2C or 1Wire sensor measuring temperature of the board (or even
inside the device enclosure) would point at the root of the tree.

3. Basing on the above, the thermal limits of the device could be
described in the tree (again, don't pay attention to naming):
cpu0: cpu@0 {
thermal_limits {
cooling {
};
critical {
};
};
};
or "hardcoded" in the device driver. After all, as you have pointed out,
the limits are the device's characteristic, so I see no problem with the
"SOC driver" registering the trip points on its own. I'm sure there will
be situations when it's better to parametrize such data via Device Tree,
so it makes perfect sense to have bindings defined for such occasion.
Anyway, as long as the tree describes the relation between the monitored
device, the sensor and the cooling device all should work.

Now, my personal opinion is that the tree should focus at "system level"
data (ie. how is the device connected to others) and describe as little
information that can be probed in runtime - a small (small!) array of
silicon variants in the driver doesn't hurt (unless we're talking about
hundreds of possibilities, where the device tree is obviously the place
to have them). But this is just a rule of thumb I'm following.

Cheers!

Pawel

2013-07-24 13:25:47

by Eduardo Valentin

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On 23-07-2013 21:44, Stephen Warren wrote:
> On 07/22/2013 07:25 AM, Eduardo Valentin wrote:
>> Hello Grant and Rob,
>>
>> (Resending, as I got a message saying:
>> <[email protected]>: Recipient address rejected:
>> User has moved to devicetree at vger.kernel.org)
>>
>> I am writing this email to you specifically to ask your technical
>> assessment with respect to representing device thermal limits as
>> device tree nodes. I am proposing to introduce device tree nodes to
>> describe these limits as thermal zones, their composition and their
>> relations with cooling devices and other thermal zones (thermal
>> data).
>
> Given:
> https://lkml.org/lkml/2013/7/20/69
> [PATCH 3/3] MAINTAINERS: Refactor device tree maintainership
>
> I'm explicitly CCing a few people besides Grant/Rob, and qouting the
> whole email.
>

OK. Cool. In case thermal limits specification finds its way into DT, I
am willing to volunteer to be maintainer for the resulting bindings, in
case there is this need.

>>From my perspective, the concept of including thermal limits in DT
> seems reasonable, although I haven't looked at the proposed binding
> itself in detail yet.

Yeah, thanks Warren for your support.


>
>> As you should know, device thermal limits are part of hardware
>> specification. Considering your board layout, mechanics, power
>> dissipation and composition of ICs, etc, that will impose thermal
>> requirements on your system, and infringing these limits can lead
>> to device damage, device life time reduction or even end user harm.
>> Thus, the thermal data help to describe the hardware limits and
>> what needs to be done if those limits are crosses, as part of your
>> board design and non-functional requirements. Obviously that is
>> very dependent on your hardware, and not all of them will have
>> these non-functional requirements. Besides, describing these limits
>> has *nothing* to do with how you actually find these limits.
>>
>> In any case, there is a need to properly represent these
>> requirements and I am proposing to have this representation in
>> device tree. There were already couple of counter-arguments
>> claiming this is actually about configuration and performance
>> profile description. But I still stand against these two readings
>> of this proposal and again state that if one interprets it as
>> configuration or performance profile, that is a mis-understanding
>> [0]. Let me state it clear (again [1]), my proposal is to describe
>> hardware thermal limits, because these limits are part of a
>> hardware specification; representing in device tree would not
>> infringe the original purpose of this data structure ("The Device
>> Tree is a data structure for describing hardware."[2]).
>>
>> Before I explain my proposal, I want to highlight also that these
>> data is represented elsewhere already and it is reused across
>> different OS's. Thermal data is described using ACPI [3] and
>> operating systems ACPI-aware do support the interpretation of
>> thermal data. Linux is one example of such systems (I believe I do
>> not need to enlist here all systems supporting ACPI). On the other
>> hand, not all systems have ACPI or are specified to use ACPI.
>> Thus, here is another reason to represent properly thermal data, so
>> that we can scale across systems.
>>
>> In the specific case of Linux, the common thermal concepts between
>> ACPI systems and non-ACPI systems have been represented in the
>> thermal framework (CONFIG_THERMAL). Today, on ACPI systems, thermal
>> data is fetched from bootloader with help from the common ACPI
>> parser. For non-ACPI systems, the thermal data is actually coded as
>> part of device drivers.
>>
>> So, to the point, a brief explanation of my proposal goes as
>> follows: i - trip points: a node to describe a point in the
>> temperature domain in which the system has to take an action. This
>> node describes just the point, not the action. Properties here are
>> temperature, hysteresis, and type (critical, hot, passive, active,
>> etc). ii - binding parameters: the bind_param node is a node to
>> describe how actions (cooling devices) get assigned to trip points.
>> Cooling devices are expected to be loaded in the target system.
>> Properties here are: cooling device name, weight, trip_mask and
>> limits. iii - thermal zones: the thermal_zone node is the node
>> containing all the required info for describing a thermal zone with
>> hardware thermal limitation, including its bindings with cooling
>> devices. Properties here are: type, passive_delay, polling_delay,
>> governor. The thermal_zone node must contain, apart from its own
>> properties, one node containing trip nodes and one node containing
>> all the zone bind parameters.
>>
>> Here is an example (on OMAP4430): thermal_zone { type = "CPU"; mask
>> = <0x03>; /* trips writability */ passive_delay = <250>; /*
>> milliseconds */ polling_delay = <1000>; /* milliseconds */ governor
>> = "step_wise"; trips { alert@100000{ temperature = <100000>; /*
>> milliCelsius hysteresis = <2000>; /* milliCelsius */ type =
>> <THERMAL_TRIP_PASSIVE>; }; crit@125000{ temperature = <125000>; /*
>> milliCelsius hysteresis = <2000>; /* milliCelsius */ type =
>> <THERMAL_TRIP_CRITICAL>; }; }; bind_params { action@0{
>> cooling_device = "thermal-cpufreq"; weight = <100>; /* percentage
>> */ mask = <0x01>; /* no limits, using defaults */ }; }; };
>>
>> In this current proposal, a 'thermal_zone' node would be embedded
>> inside a temperature sensor node, for simplicity. But other
>> possible builds could embedded them in the device with thermal
>> limits (CPU nodes, for instance) or they could be not embedded in
>> any specific node.
>>
>> A full documented description can be found here [4]. Also a branch
>> containing: (a) needed changes in order to have this DT parser; (b)
>> the DT parser with documentation (c) examples on how drivers could
>> be changes to use the parser can be found in my branch here [5]. I
>> wrote the thermal DT parser to build thermal zones with the thermal
>> framework API. However, if one does not want to do that, it can
>> simple do not include a CONFIG_THERMAL_OF=y in her/his build, and
>> the calls will be translated to nops, and the device tree thermal
>> data can be parsed to somewhere else interested (other subsystem or
>> even user land). A TODO on this implementation is that it still
>> lacks the representation of thermal zones composed by several
>> sensors. However, I believe it is better to take an incremental
>> approach here. This series can already be used to improve most of
>> the existing platform thermal drivers (most are CPU thermal
>> drivers) and to reuse the existing code of some hwmon sensors to
>> build thermal zones for board thermal requirements.
>>
>> I have already posted a patch series with this proposal on [6],
>> that contains a reference for the original RFC. But looks like my
>> messages got moderated on device tree mailing list. Obviously,
>> within PM forum, feedback was quite positive. However, we cannot
>> proceed without proper assessment of other subsystems. lm-sensors
>> folks (Guenter) seam to be strongly against this series, as there
>> is a fear that this may introduce a mis-usage of DT. I still
>> believe this is needed for hardware description, and thus not a
>> infringement on DT purposes.
>>
>> Please let me know your thoughts on this topic and apologize me if
>> my previous messages on this topic did not reach you (hope they
>> reach now).
>>
>> All best,
>>
>> Eduardo Valentin
>>
>> [0] - https://lkml.org/lkml/2013/7/17/621 [1] -
>> https://lkml.org/lkml/2013/7/18/279 [2] - http://www.devicetree.org [3] -
>> http://www.acpi.info/ [4] -
>> https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/diff/Documentation/devicetree/bindings/thermal/thermal.txt?h=thermal_work/thermal_core/dt_parser&id=405bf0b51457ed055a082af2653d7ce757bc2e91
>>
>>
> [5] -
>> https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/log/?h=thermal_work/thermal_core/dt_parser
>>
>>
> [6] - https://lkml.org/lkml/2013/7/17/923
>>
>>
>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-07-24 13:30:28

by Eduardo Valentin

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On 24-07-2013 06:45, Mark Rutland wrote:
> On Wed, Jul 24, 2013 at 02:44:38AM +0100, Stephen Warren wrote:
>> On 07/22/2013 07:25 AM, Eduardo Valentin wrote:
>>> Hello Grant and Rob,
>>>
>>> (Resending, as I got a message saying:
>>> <[email protected]>: Recipient address rejected:
>>> User has moved to devicetree at vger.kernel.org)
>>>
>>> I am writing this email to you specifically to ask your technical
>>> assessment with respect to representing device thermal limits as
>>> device tree nodes. I am proposing to introduce device tree nodes to
>>> describe these limits as thermal zones, their composition and their
>>> relations with cooling devices and other thermal zones (thermal
>>> data).
>>
>> Given:
>> https://lkml.org/lkml/2013/7/20/69
>> [PATCH 3/3] MAINTAINERS: Refactor device tree maintainership
>>
>> I'm explicitly CCing a few people besides Grant/Rob, and qouting the
>> whole email.
>>
>> From my perspective, the concept of including thermal limits in DT
>> seems reasonable, although I haven't looked at the proposed binding
>> itself in detail yet.
>
> The concept of defining hard thermal limits in DT certianly seems
> reasonable.

Good.

>
>>From a quick look at the version on lkml [1], it seems like this leaks a
> Linux implementation details (e.g. governer names) into the binding, and
> I think that the linkage of devices to thermal zones should be definedd
> more explicitly. A reposting of the series to devicetree (and lakml?)
> would be helpful for review.
>

On governor names, here are different approaches:
(a) - name the property 'policy' and let OS decide to interpret it.
(b) - remove this property and let OS decide what to do with thermal
zones by default.

On the linkage, there are essentially two other approaches, as I
mentioned below in the original RFC. First would be to have the
thermal_zone binding inside the node of the device requiring thermal
limits, this way the linkage would be more obvious, I think. Other
approach would be to link them by having a property on the sensor node
to the monitored device node, as suggested in other email.

> Thanks,
> Mark.
>
> [1] https://lkml.org/lkml/2013/7/17/379
>
>>
>>> As you should know, device thermal limits are part of hardware
>>> specification. Considering your board layout, mechanics, power
>>> dissipation and composition of ICs, etc, that will impose thermal
>>> requirements on your system, and infringing these limits can lead
>>> to device damage, device life time reduction or even end user harm.
>>> Thus, the thermal data help to describe the hardware limits and
>>> what needs to be done if those limits are crosses, as part of your
>>> board design and non-functional requirements. Obviously that is
>>> very dependent on your hardware, and not all of them will have
>>> these non-functional requirements. Besides, describing these limits
>>> has *nothing* to do with how you actually find these limits.
>>>
>>> In any case, there is a need to properly represent these
>>> requirements and I am proposing to have this representation in
>>> device tree. There were already couple of counter-arguments
>>> claiming this is actually about configuration and performance
>>> profile description. But I still stand against these two readings
>>> of this proposal and again state that if one interprets it as
>>> configuration or performance profile, that is a mis-understanding
>>> [0]. Let me state it clear (again [1]), my proposal is to describe
>>> hardware thermal limits, because these limits are part of a
>>> hardware specification; representing in device tree would not
>>> infringe the original purpose of this data structure ("The Device
>>> Tree is a data structure for describing hardware."[2]).
>>>
>>> Before I explain my proposal, I want to highlight also that these
>>> data is represented elsewhere already and it is reused across
>>> different OS's. Thermal data is described using ACPI [3] and
>>> operating systems ACPI-aware do support the interpretation of
>>> thermal data. Linux is one example of such systems (I believe I do
>>> not need to enlist here all systems supporting ACPI). On the other
>>> hand, not all systems have ACPI or are specified to use ACPI.
>>> Thus, here is another reason to represent properly thermal data, so
>>> that we can scale across systems.
>>>
>>> In the specific case of Linux, the common thermal concepts between
>>> ACPI systems and non-ACPI systems have been represented in the
>>> thermal framework (CONFIG_THERMAL). Today, on ACPI systems, thermal
>>> data is fetched from bootloader with help from the common ACPI
>>> parser. For non-ACPI systems, the thermal data is actually coded as
>>> part of device drivers.
>>>
>>> So, to the point, a brief explanation of my proposal goes as
>>> follows: i - trip points: a node to describe a point in the
>>> temperature domain in which the system has to take an action. This
>>> node describes just the point, not the action. Properties here are
>>> temperature, hysteresis, and type (critical, hot, passive, active,
>>> etc). ii - binding parameters: the bind_param node is a node to
>>> describe how actions (cooling devices) get assigned to trip points.
>>> Cooling devices are expected to be loaded in the target system.
>>> Properties here are: cooling device name, weight, trip_mask and
>>> limits. iii - thermal zones: the thermal_zone node is the node
>>> containing all the required info for describing a thermal zone with
>>> hardware thermal limitation, including its bindings with cooling
>>> devices. Properties here are: type, passive_delay, polling_delay,
>>> governor. The thermal_zone node must contain, apart from its own
>>> properties, one node containing trip nodes and one node containing
>>> all the zone bind parameters.
>>>
>>> Here is an example (on OMAP4430): thermal_zone { type = "CPU"; mask
>>> = <0x03>; /* trips writability */ passive_delay = <250>; /*
>>> milliseconds */ polling_delay = <1000>; /* milliseconds */ governor
>>> = "step_wise"; trips { alert@100000{ temperature = <100000>; /*
>>> milliCelsius hysteresis = <2000>; /* milliCelsius */ type =
>>> <THERMAL_TRIP_PASSIVE>; }; crit@125000{ temperature = <125000>; /*
>>> milliCelsius hysteresis = <2000>; /* milliCelsius */ type =
>>> <THERMAL_TRIP_CRITICAL>; }; }; bind_params { action@0{
>>> cooling_device = "thermal-cpufreq"; weight = <100>; /* percentage
>>> */ mask = <0x01>; /* no limits, using defaults */ }; }; };
>>>
>>> In this current proposal, a 'thermal_zone' node would be embedded
>>> inside a temperature sensor node, for simplicity. But other
>>> possible builds could embedded them in the device with thermal
>>> limits (CPU nodes, for instance) or they could be not embedded in
>>> any specific node.
>>>
>>> A full documented description can be found here [4]. Also a branch
>>> containing: (a) needed changes in order to have this DT parser; (b)
>>> the DT parser with documentation (c) examples on how drivers could
>>> be changes to use the parser can be found in my branch here [5]. I
>>> wrote the thermal DT parser to build thermal zones with the thermal
>>> framework API. However, if one does not want to do that, it can
>>> simple do not include a CONFIG_THERMAL_OF=y in her/his build, and
>>> the calls will be translated to nops, and the device tree thermal
>>> data can be parsed to somewhere else interested (other subsystem or
>>> even user land). A TODO on this implementation is that it still
>>> lacks the representation of thermal zones composed by several
>>> sensors. However, I believe it is better to take an incremental
>>> approach here. This series can already be used to improve most of
>>> the existing platform thermal drivers (most are CPU thermal
>>> drivers) and to reuse the existing code of some hwmon sensors to
>>> build thermal zones for board thermal requirements.
>>>
>>> I have already posted a patch series with this proposal on [6],
>>> that contains a reference for the original RFC. But looks like my
>>> messages got moderated on device tree mailing list. Obviously,
>>> within PM forum, feedback was quite positive. However, we cannot
>>> proceed without proper assessment of other subsystems. lm-sensors
>>> folks (Guenter) seam to be strongly against this series, as there
>>> is a fear that this may introduce a mis-usage of DT. I still
>>> believe this is needed for hardware description, and thus not a
>>> infringement on DT purposes.
>>>
>>> Please let me know your thoughts on this topic and apologize me if
>>> my previous messages on this topic did not reach you (hope they
>>> reach now).
>>>
>>> All best,
>>>
>>> Eduardo Valentin
>>>
>>> [0] - https://lkml.org/lkml/2013/7/17/621 [1] -
>>> https://lkml.org/lkml/2013/7/18/279 [2] - http://www.devicetree.org [3] -
>>> http://www.acpi.info/ [4] -
>>> https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/diff/Documentation/devicetree/bindings/thermal/thermal.txt?h=thermal_work/thermal_core/dt_parser&id=405bf0b51457ed055a082af2653d7ce757bc2e91
>>>
>>>
>> [5] -
>>> https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/log/?h=thermal_work/thermal_core/dt_parser
>>>
>>>
>> [6] - https://lkml.org/lkml/2013/7/17/923
>>>
>>>
>>
>>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-07-24 15:06:17

by Eduardo Valentin

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes


Pawel,

On 24-07-2013 07:19, Pawel Moll wrote:
> Greeting,
>
> Funnily enough I had this discussion couple a months ago and made my
> mind back then :-)
>

:-)

>> On 07/22/2013 07:25 AM, Eduardo Valentin wrote:
>>> representing in device tree would not
>>> infringe the original purpose of this data structure ("The Device
>>> Tree is a data structure for describing hardware."[2]).
>
> I couldn't agree more, with a few remarks...
>

Nice.

>>> In this current proposal, a 'thermal_zone' node would be embedded
>>> inside a temperature sensor node, for simplicity. But other
>>> possible builds could embedded them in the device with thermal
>>> limits (CPU nodes, for instance) or they could be not embedded in
>>> any specific node.
>
> So this is the detail that actually caused most of the disagreement :-)
>
> My position on this can be summarized as follows:
>
> 1. As you have pointed out, the thermal limits are related to the
> *device being monitored*, not the sensor itself.
>

Yeah, thinking of it now, this original proposal, it lacks a stronger
relationship mapping between monitored and monitoring devices. But it
does have it..

> 2. Therefore the tree should express relation between those two; a
> sensor mode should be connected (via phandle most likely) to the device

.. this is done, more or less, by means of the 'type' property (see
original RFC binding).

> being monitored, eg. (I'm not suggesting any property names, just trying
> to express the idea ;-) memory mapped PVT "sensor" monitoring a core
> (cpu) temperature could look like this:
> cpu0: cpu@0 {
> };
>
> sensor@xxxx {
> reg = <xxx>;
> device_monitored = <&cpu0>;
> };

To, this is not a sensor property. It could be a property in the
thermal_zone node itself.

> while I2C or 1Wire sensor measuring temperature of the board (or even
> inside the device enclosure) would point at the root of the tree.
>
> 3. Basing on the above, the thermal limits of the device could be
> described in the tree (again, don't pay attention to naming):
> cpu0: cpu@0 {
> thermal_limits {
> cooling {
> };
> critical {
> };
> };
> };
> or "hardcoded" in the device driver. After all, as you have pointed out,

For example:
cpu0: cpu@0 {
/* ... cpu needed bindings */

thermal_zone {
type = "CPU";

monitoring_device = <&sensor@xxxx
&sensor@yyyy>;

mask = <0x03>; /* trips writability */
passive_delay = <250>; /* milliseconds */
polling_delay = <1000>; /* milliseconds */
policy = "step_wise";
trips {
alert@100000{
temperature = <100000>; /* milliCelsius
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
};
crit@125000{
temperature = <125000>; /* milliCelsius
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_CRITICAL>;
};
};
bind_params {
action@0{
cooling_device = "thermal-cpufreq";
weight = <100>; /* percentage */
mask = <0x01>;
/* no limits, using defaults */
};
};
};
};

/* .. other binding .. */

sensor@xxxx {
reg = <xxx>;
compatible = <xxx>;
};

sensor@yyyy {
reg = <yyyy>;
compatible = <yyyy>;
};


The above way, I think it is less intrusive to sensor bindings. Besides,
it gives the flexibility to have more that one sensor monitoring a
thermal zone. In Linux we don't have this support, but in practice, we
do have this requirement. Durgadoss has proposed an improvement of the
thermal framework as a RFC. So, the support to have several sensors per
zone will come.

Another way, as I mentioned in the original RFC, an option would be to
have the thermal_zone node not embedded in any device node. But them, we
would need to firmly link it to other device nodes, to describe what is
monitored and what is used for monitoring. Something like:
thermal_zone {
type = "CPU";
monitored_device = <&cpu0>;
monitoring_device = <&sensor@xxxx
&sensor@yyyy>;
mask = <0x03>; /* trips writability */
passive_delay = <250>; /* milliseconds */
polling_delay = <1000>; /* milliseconds */
policy = "step_wise";
trips {
alert@100000{
temperature = <100000>; /* milliCelsius
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_PASSIVE>;
};
crit@125000{
temperature = <125000>; /* milliCelsius
hysteresis = <2000>; /* milliCelsius */
type = <THERMAL_TRIP_CRITICAL>;
};
};
bind_params {
action@0{
cooling_device = "thermal-cpufreq";
weight = <100>; /* percentage */
mask = <0x01>;
/* no limits, using defaults */
};
};
};

With the above I believe we could have dts(i) files describing only
thermal, for instance.

> the limits are the device's characteristic, so I see no problem with the
> "SOC driver" registering the trip points on its own. I'm sure there will
> be situations when it's better to parametrize such data via Device Tree,
> so it makes perfect sense to have bindings defined for such occasion.
> Anyway, as long as the tree describes the relation between the monitored
> device, the sensor and the cooling device all should work.
>
> Now, my personal opinion is that the tree should focus at "system level"
> data (ie. how is the device connected to others) and describe as little
> information that can be probed in runtime - a small (small!) array of
> silicon variants in the driver doesn't hurt (unless we're talking about
> hundreds of possibilities, where the device tree is obviously the place
> to have them). But this is just a rule of thumb I'm following.
>
> Cheers!
>
> Pawel
>
>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-07-25 16:15:36

by Pawel Moll

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On Wed, 2013-07-24 at 16:04 +0100, Eduardo Valentin wrote:
> > 1. As you have pointed out, the thermal limits are related to the
> > *device being monitored*, not the sensor itself.
> >
> Yeah, thinking of it now, this original proposal, it lacks a stronger
> relationship mapping between monitored and monitoring devices. But it
> does have it..
>
> > 2. Therefore the tree should express relation between those two; a
> > sensor mode should be connected (via phandle most likely) to the device
>
> .. this is done, more or less, by means of the 'type' property (see
> original RFC binding).

I'm not sure what do you mean (regarding "type"), but I think I got what
I wanted with the "monitoring_device" phandles (shouldn't it be
"_device*s*" then? ;-).

> For example:
> cpu0: cpu@0 {
> /* ... cpu needed bindings */
>
> thermal_zone {
> type = "CPU";

So what does this exactly mean? What is so special about CPU? What other
types you've got there? (Am I just lazy not looking at the numerous
links you provided? ;-)

> monitoring_device = <&sensor@xxxx
> &sensor@yyyy>;
>
> mask = <0x03>; /* trips writability */
> passive_delay = <250>; /* milliseconds */
> polling_delay = <1000>; /* milliseconds */
> policy = "step_wise";

The word "policy" doesn't sound to me like a "hardware feature",
wouldn't you agree?

> trips {
> alert@100000{
> temperature = <100000>; /* milliCelsius
> hysteresis = <2000>; /* milliCelsius */
> type = <THERMAL_TRIP_PASSIVE>;
> };
> crit@125000{
> temperature = <125000>; /* milliCelsius
> hysteresis = <2000>; /* milliCelsius */
> type = <THERMAL_TRIP_CRITICAL>;
> };
> };
> bind_params {
> action@0{
> cooling_device = "thermal-cpufreq";

Why is it a string? It seems very Linux-y... (cpufreq) Is there any
particular reason not to have phandles to the fans that have any impact
on the zone?

> weight = <100>; /* percentage */

Does this mean: how "successful" will be the particular fan?

> Another way, as I mentioned in the original RFC, an option would be to
> have the thermal_zone node not embedded in any device node. But them, we
> would need to firmly link it to other device nodes, to describe what is
> monitored and what is used for monitoring.

You mean the zone nodes would live at the top level of the tree? To my
mind the root represents the device (the board, whatever you call it),
which, I guess, may be what you want, if the zone "covers" the whole
device? (as in: one fan and one sensor somewhere in the box and the only
thing you may do is to check the ambient temperature and start the fan
if it's to high)

> With the above I believe we could have dts(i) files describing only
> thermal, for instance.

You can include a dtsi inside the device node, no problem with that:

/dts-v1/;

/ {
smb {
compatible = "simple-bus";

/include/ "vexpress-v2m-rs1.dtsi"
};
};

Pawel

2013-07-25 16:38:22

by Pawel Moll

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On Thu, 2013-07-25 at 17:15 +0100, Pawel Moll wrote:
> > Another way, as I mentioned in the original RFC, an option would be to
> > have the thermal_zone node not embedded in any device node. But them, we
> > would need to firmly link it to other device nodes, to describe what is
> > monitored and what is used for monitoring.
>
> You mean the zone nodes would live at the top level of the tree? To my
> mind the root represents the device (the board, whatever you call it),

What I wanted to say was: ... so the zone would still be embedded in a
device node :-)

Pawel

2013-07-25 17:21:28

by Eduardo Valentin

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On 25-07-2013 12:15, Pawel Moll wrote:
> On Wed, 2013-07-24 at 16:04 +0100, Eduardo Valentin wrote:
>>> 1. As you have pointed out, the thermal limits are related to the
>>> *device being monitored*, not the sensor itself.
>>>
>> Yeah, thinking of it now, this original proposal, it lacks a stronger
>> relationship mapping between monitored and monitoring devices. But it
>> does have it..
>>
>>> 2. Therefore the tree should express relation between those two; a
>>> sensor mode should be connected (via phandle most likely) to the device
>>
>> .. this is done, more or less, by means of the 'type' property (see
>> original RFC binding).
>
> I'm not sure what do you mean (regarding "type"), but I think I got what
> I wanted with the "monitoring_device" phandles (shouldn't it be
> "_device*s*" then? ;-).

yes. it is supposed to be devices.

>
>> For example:
>> cpu0: cpu@0 {
>> /* ... cpu needed bindings */
>>
>> thermal_zone {
>> type = "CPU";
>
> So what does this exactly mean? What is so special about CPU? What other
> types you've got there? (Am I just lazy not looking at the numerous
> links you provided? ;-)

Hehehe. OK. Type is supposed to describe what your zone is representing.

>
>> monitoring_device = <&sensor@xxxx
>> &sensor@yyyy>;
>>
>> mask = <0x03>; /* trips writability */
>> passive_delay = <250>; /* milliseconds */
>> polling_delay = <1000>; /* milliseconds */
>> policy = "step_wise";
>
> The word "policy" doesn't sound to me like a "hardware feature",
> wouldn't you agree?

Agreed. As I mentioned in other email, we can leave this to OS decide
what to use, by default, for instance.

>
>> trips {
>> alert@100000{
>> temperature = <100000>; /* milliCelsius
>> hysteresis = <2000>; /* milliCelsius */
>> type = <THERMAL_TRIP_PASSIVE>;
>> };
>> crit@125000{
>> temperature = <125000>; /* milliCelsius
>> hysteresis = <2000>; /* milliCelsius */
>> type = <THERMAL_TRIP_CRITICAL>;
>> };
>> };
>> bind_params {
>> action@0{
>> cooling_device = "thermal-cpufreq";
>
> Why is it a string? It seems very Linux-y... (cpufreq) Is there any
> particular reason not to have phandles to the fans that have any impact
> on the zone?
>

Because fans are not the only way to cool your system, specially those
systems that don't feature fans. Managing the speed of your CPU is one
example of lowering temperature without fans. Managing the load on your
system is another way. These are obviously, virtual concepts. And
because we have physical ways and logical ways to cool the zone, then I
didnt put a phandle to a device there.


>> weight = <100>; /* percentage */
>
> Does this mean: how "successful" will be the particular fan?
>
>> Another way, as I mentioned in the original RFC, an option would be to
>> have the thermal_zone node not embedded in any device node. But them, we
>> would need to firmly link it to other device nodes, to describe what is
>> monitored and what is used for monitoring.
>
> You mean the zone nodes would live at the top level of the tree? To my
> mind the root represents the device (the board, whatever you call it),
> which, I guess, may be what you want, if the zone "covers" the whole
> device? (as in: one fan and one sensor somewhere in the box and the only
> thing you may do is to check the ambient temperature and start the fan
> if it's to high)

Yes, for cases that we want to map zones at board level, they could be
not embedded to a specific device node.

>
>> With the above I believe we could have dts(i) files describing only
>> thermal, for instance.
>
> You can include a dtsi inside the device node, no problem with that:
>
> /dts-v1/;
>
> / {
> smb {
> compatible = "simple-bus";
>
> /include/ "vexpress-v2m-rs1.dtsi"
> };
> };

OK. got your point. What I was trying to highlight is that we could have
zones not inside device nodes, then we could write them in a thermal
dtsi. And in that case, you probably dont want to include the complete
file inside a device node.

>
> Pawel
>
>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-07-25 17:22:52

by Eduardo Valentin

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On 25-07-2013 12:38, Pawel Moll wrote:
> On Thu, 2013-07-25 at 17:15 +0100, Pawel Moll wrote:
>>> Another way, as I mentioned in the original RFC, an option would be to
>>> have the thermal_zone node not embedded in any device node. But them, we
>>> would need to firmly link it to other device nodes, to describe what is
>>> monitored and what is used for monitoring.
>>
>> You mean the zone nodes would live at the top level of the tree? To my
>> mind the root represents the device (the board, whatever you call it),
>
> What I wanted to say was: ... so the zone would still be embedded in a
> device node :-)
>

ahh ok.. I see.


> Pawel
>
>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-07-25 17:33:39

by Pawel Moll

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On Thu, 2013-07-25 at 18:20 +0100, Eduardo Valentin wrote:
> >> thermal_zone {
> >> type = "CPU";
> >
> > So what does this exactly mean? What is so special about CPU? What other
> > types you've got there? (Am I just lazy not looking at the numerous
> > links you provided? ;-)
>
> Hehehe. OK. Type is supposed to describe what your zone is representing.

As in "a name"? So, for example "The board", "PSU"? What I meant to ask
was: does the string carry any meaning?

> >> monitoring_device = <&sensor@xxxx
> >> &sensor@yyyy>;
> >>
> >> mask = <0x03>; /* trips writability */
> >> passive_delay = <250>; /* milliseconds */
> >> polling_delay = <1000>; /* milliseconds */
> >> policy = "step_wise";
> >
> > The word "policy" doesn't sound to me like a "hardware feature",
> > wouldn't you agree?
>
> Agreed. As I mentioned in other email, we can leave this to OS decide
> what to use, by default, for instance.

Cool, I believe it is the right thing to do.

> >
> >> trips {
> >> alert@100000{
> >> temperature = <100000>; /* milliCelsius
> >> hysteresis = <2000>; /* milliCelsius */
> >> type = <THERMAL_TRIP_PASSIVE>;
> >> };
> >> crit@125000{
> >> temperature = <125000>; /* milliCelsius
> >> hysteresis = <2000>; /* milliCelsius */
> >> type = <THERMAL_TRIP_CRITICAL>;
> >> };
> >> };
> >> bind_params {
> >> action@0{
> >> cooling_device = "thermal-cpufreq";
> >
> > Why is it a string? It seems very Linux-y... (cpufreq) Is there any
> > particular reason not to have phandles to the fans that have any impact
> > on the zone?
>
> Because fans are not the only way to cool your system, specially those
> systems that don't feature fans. Managing the speed of your CPU is one
> example of lowering temperature without fans. Managing the load on your
> system is another way. These are obviously, virtual concepts. And
> because we have physical ways and logical ways to cool the zone, then I
> didnt put a phandle to a device there.

"virtual concepts"... This is where my problem lies... It's not hardware
so it doesn't seem to belong in the tree at the first sight. Shouldn't
it focus on "physical data" instead? As in: point at devices that have
some impact on the conditions? For example, you can say "please, do the
right thing to cool your environment down" to both CPU and fan, can't
you? The "cooling driver" for the CPU would know that it has to slow
down, while a driver for the fan would know that it has to speed up ;-)

What I'm trying to say is that in my opinion the tree should simply link
the object, the sensor and the actuator. Nothing more, nothing less.

Thanks for your time!

Pawel

2013-07-26 19:57:03

by Eduardo Valentin

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On 25-07-2013 13:33, Pawel Moll wrote:
> On Thu, 2013-07-25 at 18:20 +0100, Eduardo Valentin wrote:
>>>> thermal_zone {
>>>> type = "CPU";
>>>
>>> So what does this exactly mean? What is so special about CPU? What other
>>> types you've got there? (Am I just lazy not looking at the numerous
>>> links you provided? ;-)
>>
>> Hehehe. OK. Type is supposed to describe what your zone is representing.
>
> As in "a name"? So, for example "The board", "PSU"? What I meant to ask
> was: does the string carry any meaning?
>
>>>> monitoring_device = <&sensor@xxxx
>>>> &sensor@yyyy>;
>>>>
>>>> mask = <0x03>; /* trips writability */
>>>> passive_delay = <250>; /* milliseconds */
>>>> polling_delay = <1000>; /* milliseconds */
>>>> policy = "step_wise";
>>>
>>> The word "policy" doesn't sound to me like a "hardware feature",
>>> wouldn't you agree?
>>
>> Agreed. As I mentioned in other email, we can leave this to OS decide
>> what to use, by default, for instance.
>
> Cool, I believe it is the right thing to do.
>
>>>
>>>> trips {
>>>> alert@100000{
>>>> temperature = <100000>; /* milliCelsius
>>>> hysteresis = <2000>; /* milliCelsius */
>>>> type = <THERMAL_TRIP_PASSIVE>;
>>>> };
>>>> crit@125000{
>>>> temperature = <125000>; /* milliCelsius
>>>> hysteresis = <2000>; /* milliCelsius */
>>>> type = <THERMAL_TRIP_CRITICAL>;
>>>> };
>>>> };
>>>> bind_params {
>>>> action@0{
>>>> cooling_device = "thermal-cpufreq";
>>>
>>> Why is it a string? It seems very Linux-y... (cpufreq) Is there any
>>> particular reason not to have phandles to the fans that have any impact
>>> on the zone?
>>
>> Because fans are not the only way to cool your system, specially those
>> systems that don't feature fans. Managing the speed of your CPU is one
>> example of lowering temperature without fans. Managing the load on your
>> system is another way. These are obviously, virtual concepts. And
>> because we have physical ways and logical ways to cool the zone, then I
>> didnt put a phandle to a device there.
>
> "virtual concepts"... This is where my problem lies... It's not hardware
> so it doesn't seem to belong in the tree at the first sight. Shouldn't


Yeah, in fact, this is exactly the point that creates most of the
disagreement. You may check Guenter's arguments against this proposal
(in my original RFC email, there is a link to it).

Well, if one don't want to see this as a 'virtual concept' it could say
the cooling device is the cpu itself:
cooling_device = <&cpu0>;



> it focus on "physical data" instead? As in: point at devices that have
> some impact on the conditions? For example, you can say "please, do the
> right thing to cool your environment down" to both CPU and fan, can't
> you? The "cooling driver" for the CPU would know that it has to slow
> down, while a driver for the fan would know that it has to speed up ;-)
>
> What I'm trying to say is that in my opinion the tree should simply link
> the object, the sensor and the actuator. Nothing more, nothing less.

OK. I think it would be a little unfair to have only links, without
describing what this link is supposed to be or how it is supposed to be
used. In previous discussions, I have mentioned two similar examples
already existing in DT. Here are they: regulator bindings, one does not
describe only which device connects to which regulator, but also needs
to describe, voltage limits, current limits, offsets, and other
properties. And an existing 'virtual concept' would be predefined CPU
OPPs, that feed the opp layer. Those are configurations of the hardware
that define a 'virtual' concept of operating point.

So, saying we need to describe only physical connections or touchable
things would be a little unfair, IMO. Besides, thermal is still physical
:-).

>
> Thanks for your time!

Thanks for the input!

>
> Pawel
>
>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-08-06 11:14:16

by Pawel Moll

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

Apologies about the delay, I was "otherwise engaged" for a week...

I hope you haven't lost all motivation to work on this subject, as it's
really worth the while!

On Fri, 2013-07-26 at 20:55 +0100, Eduardo Valentin wrote:
> On 25-07-2013 13:33, Pawel Moll wrote:
> > On Thu, 2013-07-25 at 18:20 +0100, Eduardo Valentin wrote:
> >>>> thermal_zone {
> >>>> type = "CPU";
> >>>
> >>> So what does this exactly mean? What is so special about CPU? What other
> >>> types you've got there? (Am I just lazy not looking at the numerous
> >>> links you provided? ;-)
> >>
> >> Hehehe. OK. Type is supposed to describe what your zone is representing.
> >
> > As in "a name"? So, for example "The board", "PSU"? What I meant to ask
> > was: does the string carry any meaning?

You haven't commended on this...

> >>>> trips {
> >>>> alert@100000{
> >>>> temperature = <100000>; /* milliCelsius
> >>>> hysteresis = <2000>; /* milliCelsius */
> >>>> type = <THERMAL_TRIP_PASSIVE>;
> >>>> };
> >>>> crit@125000{
> >>>> temperature = <125000>; /* milliCelsius
> >>>> hysteresis = <2000>; /* milliCelsius */
> >>>> type = <THERMAL_TRIP_CRITICAL>;
> >>>> };
> >>>> };
> >>>> bind_params {
> >>>> action@0{
> >>>> cooling_device = "thermal-cpufreq";
> >>>
> >>> Why is it a string? It seems very Linux-y... (cpufreq) Is there any
> >>> particular reason not to have phandles to the fans that have any impact
> >>> on the zone?
> >>
> >> Because fans are not the only way to cool your system, specially those
> >> systems that don't feature fans. Managing the speed of your CPU is one
> >> example of lowering temperature without fans. Managing the load on your
> >> system is another way. These are obviously, virtual concepts. And
> >> because we have physical ways and logical ways to cool the zone, then I
> >> didnt put a phandle to a device there.
> >
> > "virtual concepts"... This is where my problem lies... It's not hardware
> > so it doesn't seem to belong in the tree at the first sight. Shouldn't
>
> Yeah, in fact, this is exactly the point that creates most of the
> disagreement. You may check Guenter's arguments against this proposal
> (in my original RFC email, there is a link to it).
>
> Well, if one don't want to see this as a 'virtual concept' it could say
> the cooling device is the cpu itself:
> cooling_device = <&cpu0>;

Would this create any particular problem at the driver/framework side?

> > it focus on "physical data" instead? As in: point at devices that have
> > some impact on the conditions? For example, you can say "please, do the
> > right thing to cool your environment down" to both CPU and fan, can't
> > you? The "cooling driver" for the CPU would know that it has to slow
> > down, while a driver for the fan would know that it has to speed up ;-)
> >
> > What I'm trying to say is that in my opinion the tree should simply link
> > the object, the sensor and the actuator. Nothing more, nothing less.
>
> OK. I think it would be a little unfair to have only links, without
> describing what this link is supposed to be or how it is supposed to be
> used. In previous discussions, I have mentioned two similar examples
> already existing in DT. Here are they: regulator bindings, one does not
> describe only which device connects to which regulator, but also needs
> to describe, voltage limits, current limits, offsets, and other
> properties. And an existing 'virtual concept' would be predefined CPU
> OPPs, that feed the opp layer. Those are configurations of the hardware
> that define a 'virtual' concept of operating point.
>
> So, saying we need to describe only physical connections or touchable
> things would be a little unfair, IMO. Besides, thermal is still physical
> :-).

Believe me, I'm trying to be as fair as possible :-) and I see a lot of
value in describing the thermal properties of the platforms in the tree.
It's just that we really want to focus on describing the hardware, not
policies. And as you have already spent so much time on the matter, you
are in the best position to find the best set of *physical* properties
that would allow to make the right decision in the code. Could you,
please, try to make one step back and have another look at the problem?
What input data (as in: numbers :-) would you need to get what you want?
Are those numbers characteristic to the specific device (they probably
should live in the driver than) or to the board/platform (tree without a
doubt).

Thanks!

Pawel

2013-08-07 20:19:56

by Eduardo Valentin

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

Pawel, all,

On 06-08-2013 07:14, Pawel Moll wrote:
> Apologies about the delay, I was "otherwise engaged" for a week...
>

I do also excuse for my delay, as I was also "engaged" for a week or so.

> I hope you haven't lost all motivation to work on this subject, as it's
> really worth the while!

Not really! quite the opposite. Although I was looking at some other
stuff, I got this series also tested on different boards and wrote down
a couple of improvements I will be working in the coming days. Indeed,
it is worth moving forward with this work.

>
> On Fri, 2013-07-26 at 20:55 +0100, Eduardo Valentin wrote:
>> On 25-07-2013 13:33, Pawel Moll wrote:
>>> On Thu, 2013-07-25 at 18:20 +0100, Eduardo Valentin wrote:
>>>>>> thermal_zone {
>>>>>> type = "CPU";
>>>>>
>>>>> So what does this exactly mean? What is so special about CPU? What other
>>>>> types you've got there? (Am I just lazy not looking at the numerous
>>>>> links you provided? ;-)
>>>>
>>>> Hehehe. OK. Type is supposed to describe what your zone is representing.
>>>
>>> As in "a name"? So, for example "The board", "PSU"? What I meant to ask
>>> was: does the string carry any meaning?
>
> You haven't commended on this...

The string is supposed to carry meaning, yes. Couple of common used:
CPU, GPU, PCB, LCD

>
>>>>>> trips {
>>>>>> alert@100000{
>>>>>> temperature = <100000>; /* milliCelsius
>>>>>> hysteresis = <2000>; /* milliCelsius */
>>>>>> type = <THERMAL_TRIP_PASSIVE>;
>>>>>> };
>>>>>> crit@125000{
>>>>>> temperature = <125000>; /* milliCelsius
>>>>>> hysteresis = <2000>; /* milliCelsius */
>>>>>> type = <THERMAL_TRIP_CRITICAL>;
>>>>>> };
>>>>>> };
>>>>>> bind_params {
>>>>>> action@0{
>>>>>> cooling_device = "thermal-cpufreq";
>>>>>
>>>>> Why is it a string? It seems very Linux-y... (cpufreq) Is there any
>>>>> particular reason not to have phandles to the fans that have any impact
>>>>> on the zone?
>>>>
>>>> Because fans are not the only way to cool your system, specially those
>>>> systems that don't feature fans. Managing the speed of your CPU is one
>>>> example of lowering temperature without fans. Managing the load on your
>>>> system is another way. These are obviously, virtual concepts. And
>>>> because we have physical ways and logical ways to cool the zone, then I
>>>> didnt put a phandle to a device there.
>>>
>>> "virtual concepts"... This is where my problem lies... It's not hardware
>>> so it doesn't seem to belong in the tree at the first sight. Shouldn't
>>
>> Yeah, in fact, this is exactly the point that creates most of the
>> disagreement. You may check Guenter's arguments against this proposal
>> (in my original RFC email, there is a link to it).
>>
>> Well, if one don't want to see this as a 'virtual concept' it could say
>> the cooling device is the cpu itself:
>> cooling_device = <&cpu0>;
>
> Would this create any particular problem at the driver/framework side?

In this case, I believe CPUfreq driver must be thermal aware in this
case. And we need to cook a way to, whenever there is such link, the
cpufreq driver instantiates the cooling mechanism. But I need to think a
little bit more on this, will come back on this point soon.

>
>>> it focus on "physical data" instead? As in: point at devices that have
>>> some impact on the conditions? For example, you can say "please, do the
>>> right thing to cool your environment down" to both CPU and fan, can't
>>> you? The "cooling driver" for the CPU would know that it has to slow
>>> down, while a driver for the fan would know that it has to speed up ;-)
>>>
>>> What I'm trying to say is that in my opinion the tree should simply link
>>> the object, the sensor and the actuator. Nothing more, nothing less.
>>
>> OK. I think it would be a little unfair to have only links, without
>> describing what this link is supposed to be or how it is supposed to be
>> used. In previous discussions, I have mentioned two similar examples
>> already existing in DT. Here are they: regulator bindings, one does not
>> describe only which device connects to which regulator, but also needs
>> to describe, voltage limits, current limits, offsets, and other
>> properties. And an existing 'virtual concept' would be predefined CPU
>> OPPs, that feed the opp layer. Those are configurations of the hardware
>> that define a 'virtual' concept of operating point.
>>
>> So, saying we need to describe only physical connections or touchable
>> things would be a little unfair, IMO. Besides, thermal is still physical
>> :-).
>
> Believe me, I'm trying to be as fair as possible :-) and I see a lot of
> value in describing the thermal properties of the platforms in the tree.
> It's just that we really want to focus on describing the hardware, not
> policies. And as you have already spent so much time on the matter, you
> are in the best position to find the best set of *physical* properties
> that would allow to make the right decision in the code. Could you,
> please, try to make one step back and have another look at the problem?
> What input data (as in: numbers :-) would you need to get what you want?
> Are those numbers characteristic to the specific device (they probably
> should live in the driver than) or to the board/platform (tree without a
> doubt).

Ok. My point was just that linking objects without telling when (in
temperature domain) to start using this link, may be incomplete, because
trip points are really HW dependent, because your power dissipation
profile changes from HW design to HW design. In other words, one can
say, "use device fan to cool the GPU", but depending on your GPU, you
may start using your fan when the sensor is 100C, 85C, 110C (just
picking numbers), it is really HW dependent. Besides, the same IP may be
used immersed in different ambient condition, which may cause variance
on its leakage level, and thus changes its thermal dissipation profile,
resulting in different trip points. Thus, mapping all this HW
characteristics inside specific drivers does not sound the right path to
me. That is why I believe pointing which trips to use is part of HW
description.

I must agree that, saying which governor to use, then again, is pure
policy definition though.

I will go through our discussion, consider the points risen and repost
the series as soon as possible. I will not let this one fall into the
cracks, no worries.

Thanks for your inputs, BTW.

>
> Thanks!
>
> Pawel
>
>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin


Attachments:
signature.asc (295.00 B)
OpenPGP digital signature

2013-08-08 08:53:55

by Mark Rutland

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On Wed, Aug 07, 2013 at 09:18:29PM +0100, Eduardo Valentin wrote:
> Pawel, all,
>
> On 06-08-2013 07:14, Pawel Moll wrote:
> > Apologies about the delay, I was "otherwise engaged" for a week...
> >
>
> I do also excuse for my delay, as I was also "engaged" for a week or so.
>
> > I hope you haven't lost all motivation to work on this subject, as it's
> > really worth the while!
>
> Not really! quite the opposite. Although I was looking at some other
> stuff, I got this series also tested on different boards and wrote down
> a couple of improvements I will be working in the coming days. Indeed,
> it is worth moving forward with this work.
>
> >
> > On Fri, 2013-07-26 at 20:55 +0100, Eduardo Valentin wrote:
> >> On 25-07-2013 13:33, Pawel Moll wrote:
> >>> On Thu, 2013-07-25 at 18:20 +0100, Eduardo Valentin wrote:
> >>>>>> thermal_zone {
> >>>>>> type = "CPU";
> >>>>>
> >>>>> So what does this exactly mean? What is so special about CPU? What other
> >>>>> types you've got there? (Am I just lazy not looking at the numerous
> >>>>> links you provided? ;-)
> >>>>
> >>>> Hehehe. OK. Type is supposed to describe what your zone is representing.
> >>>
> >>> As in "a name"? So, for example "The board", "PSU"? What I meant to ask
> >>> was: does the string carry any meaning?
> >
> > You haven't commended on this...
>
> The string is supposed to carry meaning, yes. Couple of common used:
> CPU, GPU, PCB, LCD

I think the point Pawel was getting at is that the string doesn't have a
*well-defined* meaning that always allows an OS to figure out the set of
relevant devices. If we have a thermal zone for "LCD", and have multiple
LCDs, which LCDs are covered? If we have a "PCB" zone, does this cover all
the devices attached to the PCB, a subset thereof, or the substrate of
the PCB itself?

A phandle list to the concerned devices for example would *always* allow
the OS to figure out the set of concerned devices.

>
> >
> >>>>>> trips {
> >>>>>> alert@100000{
> >>>>>> temperature = <100000>; /* milliCelsius
> >>>>>> hysteresis = <2000>; /* milliCelsius */
> >>>>>> type = <THERMAL_TRIP_PASSIVE>;
> >>>>>> };
> >>>>>> crit@125000{
> >>>>>> temperature = <125000>; /* milliCelsius
> >>>>>> hysteresis = <2000>; /* milliCelsius */
> >>>>>> type = <THERMAL_TRIP_CRITICAL>;
> >>>>>> };
> >>>>>> };
> >>>>>> bind_params {
> >>>>>> action@0{
> >>>>>> cooling_device = "thermal-cpufreq";
> >>>>>
> >>>>> Why is it a string? It seems very Linux-y... (cpufreq) Is there any
> >>>>> particular reason not to have phandles to the fans that have any impact
> >>>>> on the zone?
> >>>>
> >>>> Because fans are not the only way to cool your system, specially those
> >>>> systems that don't feature fans. Managing the speed of your CPU is one
> >>>> example of lowering temperature without fans. Managing the load on your
> >>>> system is another way. These are obviously, virtual concepts. And
> >>>> because we have physical ways and logical ways to cool the zone, then I
> >>>> didnt put a phandle to a device there.
> >>>
> >>> "virtual concepts"... This is where my problem lies... It's not hardware
> >>> so it doesn't seem to belong in the tree at the first sight. Shouldn't
> >>
> >> Yeah, in fact, this is exactly the point that creates most of the
> >> disagreement. You may check Guenter's arguments against this proposal
> >> (in my original RFC email, there is a link to it).
> >>
> >> Well, if one don't want to see this as a 'virtual concept' it could say
> >> the cooling device is the cpu itself:
> >> cooling_device = <&cpu0>;
> >
> > Would this create any particular problem at the driver/framework side?
>
> In this case, I believe CPUfreq driver must be thermal aware in this
> case. And we need to cook a way to, whenever there is such link, the
> cpufreq driver instantiates the cooling mechanism. But I need to think a
> little bit more on this, will come back on this point soon.

For the CPU case at least, if we have a thermal zone with an attached
sensor and no real thermal device, I don't think it's too much of a
stretch for the OS to figure out that the only thing it can do is some
cpufreq-like scaling and limiting to attempt to keep below the limits.

I don't think we need to describe the cpu as a cooling device for a cpu,
and I certainly don't think we need to describe a particular mechanism
by which the thermal limits should be maintained (though when we have a
fan, or other active cooling device, describing to the OS that this may
be used is sensible).

>
> >
> >>> it focus on "physical data" instead? As in: point at devices that have
> >>> some impact on the conditions? For example, you can say "please, do the
> >>> right thing to cool your environment down" to both CPU and fan, can't
> >>> you? The "cooling driver" for the CPU would know that it has to slow
> >>> down, while a driver for the fan would know that it has to speed up ;-)
> >>>
> >>> What I'm trying to say is that in my opinion the tree should simply link
> >>> the object, the sensor and the actuator. Nothing more, nothing less.
> >>
> >> OK. I think it would be a little unfair to have only links, without
> >> describing what this link is supposed to be or how it is supposed to be
> >> used. In previous discussions, I have mentioned two similar examples
> >> already existing in DT. Here are they: regulator bindings, one does not
> >> describe only which device connects to which regulator, but also needs
> >> to describe, voltage limits, current limits, offsets, and other
> >> properties. And an existing 'virtual concept' would be predefined CPU
> >> OPPs, that feed the opp layer. Those are configurations of the hardware
> >> that define a 'virtual' concept of operating point.
> >>
> >> So, saying we need to describe only physical connections or touchable
> >> things would be a little unfair, IMO. Besides, thermal is still physical
> >> :-).
> >
> > Believe me, I'm trying to be as fair as possible :-) and I see a lot of
> > value in describing the thermal properties of the platforms in the tree.
> > It's just that we really want to focus on describing the hardware, not
> > policies. And as you have already spent so much time on the matter, you
> > are in the best position to find the best set of *physical* properties
> > that would allow to make the right decision in the code. Could you,
> > please, try to make one step back and have another look at the problem?
> > What input data (as in: numbers :-) would you need to get what you want?
> > Are those numbers characteristic to the specific device (they probably
> > should live in the driver than) or to the board/platform (tree without a
> > doubt).
>
> Ok. My point was just that linking objects without telling when (in
> temperature domain) to start using this link, may be incomplete, because
> trip points are really HW dependent, because your power dissipation
> profile changes from HW design to HW design. In other words, one can
> say, "use device fan to cool the GPU", but depending on your GPU, you
> may start using your fan when the sensor is 100C, 85C, 110C (just
> picking numbers), it is really HW dependent. Besides, the same IP may be
> used immersed in different ambient condition, which may cause variance
> on its leakage level, and thus changes its thermal dissipation profile,
> resulting in different trip points. Thus, mapping all this HW
> characteristics inside specific drivers does not sound the right path to
> me. That is why I believe pointing which trips to use is part of HW
> description.

This sounds like the case we have with OPPs in that the information
required to figure out the valid points at which an OS can drive the
hardware is driven by rather complex interactions between various
components, and it's not clear whether describing all that information
is worthwhile. That said, I'm not sure what my opinion on the matter is.

>
> I must agree that, saying which governor to use, then again, is pure
> policy definition though.

Sounds like we agree there :)

Thanks,
Mark.

2013-08-08 13:45:26

by Pawel Moll

[permalink] [raw]
Subject: Re: RFC: device thermal limits represented in device tree nodes

On Thu, 2013-08-08 at 09:53 +0100, Mark Rutland wrote:
> On Wed, Aug 07, 2013 at 09:18:29PM +0100, Eduardo Valentin wrote:
> > Pawel, all,
> >
> > On 06-08-2013 07:14, Pawel Moll wrote:
> > > Apologies about the delay, I was "otherwise engaged" for a week...
> > >
> >
> > I do also excuse for my delay, as I was also "engaged" for a week or so.
> >
> > > I hope you haven't lost all motivation to work on this subject, as it's
> > > really worth the while!
> >
> > Not really! quite the opposite. Although I was looking at some other
> > stuff, I got this series also tested on different boards and wrote down
> > a couple of improvements I will be working in the coming days. Indeed,
> > it is worth moving forward with this work.
> >
> > >
> > > On Fri, 2013-07-26 at 20:55 +0100, Eduardo Valentin wrote:
> > >> On 25-07-2013 13:33, Pawel Moll wrote:
> > >>> On Thu, 2013-07-25 at 18:20 +0100, Eduardo Valentin wrote:
> > >>>>>> thermal_zone {
> > >>>>>> type = "CPU";
> > >>>>>
> > >>>>> So what does this exactly mean? What is so special about CPU? What other
> > >>>>> types you've got there? (Am I just lazy not looking at the numerous
> > >>>>> links you provided? ;-)
> > >>>>
> > >>>> Hehehe. OK. Type is supposed to describe what your zone is representing.
> > >>>
> > >>> As in "a name"? So, for example "The board", "PSU"? What I meant to ask
> > >>> was: does the string carry any meaning?
> > >
> > > You haven't commended on this...
> >
> > The string is supposed to carry meaning, yes. Couple of common used:
> > CPU, GPU, PCB, LCD
>
> I think the point Pawel was getting at is that the string doesn't have a
> *well-defined* meaning that always allows an OS to figure out the set of
> relevant devices. If we have a thermal zone for "LCD", and have multiple
> LCDs, which LCDs are covered? If we have a "PCB" zone, does this cover all
> the devices attached to the PCB, a subset thereof, or the substrate of
> the PCB itself?

Or: what happens if I type "lcd" instead of "LCD"? Would there be any
decision made based on this string? Or is it just a label to be used
somewhere in debugging messages?

Paweł