2010-07-22 16:22:34

by Fenghua Yu

[permalink] [raw]
Subject: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc


Document for package level thermal hwmon driver.

Signed-off-by: Fenghua Yu <[email protected]>
Reviewed-by: Len Brown <[email protected]>
---

pkgtemp | 36 ++++++++++++++++++++++++++++++++++++
1 files changed, 36 insertions(+)

diff --git a/Documentation/hwmon/pkgtemp b/Documentation/hwmon/pkgtemp
new file mode 100644
index 0000000..a60d286
--- /dev/null
+++ b/Documentation/hwmon/pkgtemp
@@ -0,0 +1,36 @@
+Kernel driver pkgtemp
+======================
+
+Supported chips:
+ * Intel family
+ Prefix: 'pkgtemp'
+ CPUID:
+ Datasheet: Intel 64 and IA-32 Architectures Software Developer's Manual
+ Volume 3A: System Programming Guide
+
+Author: Fenghua Yu
+
+Description
+-----------
+
+This driver permits reading package level temperature sensor embedded inside
+Intel CPU package. The sensors can be in core, uncore, memroy controller, or
+other componenets in a package. The feature is first implemented in Intel Sandy
+Bridge platform.
+
+Temperature is measured in degrees Celsius and measurement resolution is
+1 degree C. Valid temperatures are from 0 to TjMax degrees C, because the actual
+value of temperature register is in fact a delta from TjMax.
+
+Temperature known as TjMax is the maximum junction temperature of package.
+Intel defines this temperature as 125C. At this temperature, protection
+mechanism will perform actions to forcibly cool down the processor. Alarm
+may be raised, if the temperature grows enough (more than TjMax) to trigger
+the Out-Of-Spec bit. Following table summarizes the exported sysfs files:
+
+temp1_input - Package temperature (in millidegrees Celsius).
+temp1_crit - Maximum junction temperature (in millidegrees Celsius).
+temp1_crit_alarm - Set when Out-of-spec bit is set, never clears.
+ Correct CPU operation is no longer guaranteed.
+temp1_label - Contains string "Pysical package id X", where X is physical
+ package id.


2010-07-22 17:28:31

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

On Thu, Jul 22, 2010 at 12:22:23PM -0400, Fenghua Yu wrote:
>
> Document for package level thermal hwmon driver.
>
> Signed-off-by: Fenghua Yu <[email protected]>
> Reviewed-by: Len Brown <[email protected]>
> ---
>
> pkgtemp | 36 ++++++++++++++++++++++++++++++++++++
> 1 files changed, 36 insertions(+)
>
> diff --git a/Documentation/hwmon/pkgtemp b/Documentation/hwmon/pkgtemp
> new file mode 100644
> index 0000000..a60d286
> --- /dev/null
> +++ b/Documentation/hwmon/pkgtemp
> @@ -0,0 +1,36 @@
> +Kernel driver pkgtemp
> +======================
> +
> +Supported chips:
> + * Intel family
> + Prefix: 'pkgtemp'
> + CPUID:
> + Datasheet: Intel 64 and IA-32 Architectures Software Developer's Manual
> + Volume 3A: System Programming Guide
> +
> +Author: Fenghua Yu
> +
> +Description
> +-----------
> +
> +This driver permits reading package level temperature sensor embedded inside
> +Intel CPU package. The sensors can be in core, uncore, memroy controller, or

memroy --> memory

> +other componenets in a package. The feature is first implemented in Intel Sandy

componenets --> components

> +Bridge platform.
> +
Just for clarification - you mention a number of sensors, but unless
I am missing something only the package sensor is implemented. Is that correct ?

> +Temperature is measured in degrees Celsius and measurement resolution is
> +1 degree C. Valid temperatures are from 0 to TjMax degrees C, because the actual
> +value of temperature register is in fact a delta from TjMax.
> +
>From the code, it seems that negative values can be reported. Is it guaranteed
by the chip that (TjMax - MSR_IA32_TEMPERATURE_TARGET) >= 0 ? Otherwise, the
minimum temperature would be (TjMax - 127).

> +Temperature known as TjMax is the maximum junction temperature of package.
> +Intel defines this temperature as 125C. At this temperature, protection

Your driver bails out at TjMax >= 120, so there is some inconsistency.
Also, it seems that this is not a constant, since you are reading
MSR_IA32_TEMPERATURE_TARGET to get the value.

Since the CPUs supporting the package sensor presumably also all support
reading TjMax, maybe you can reword the above text to reflect this.

> +mechanism will perform actions to forcibly cool down the processor. Alarm
> +may be raised, if the temperature grows enough (more than TjMax) to trigger
> +the Out-Of-Spec bit. Following table summarizes the exported sysfs files:
> +
> +temp1_input - Package temperature (in millidegrees Celsius).
> +temp1_crit - Maximum junction temperature (in millidegrees Celsius).
> +temp1_crit_alarm - Set when Out-of-spec bit is set, never clears.
> + Correct CPU operation is no longer guaranteed.
> +temp1_label - Contains string "Pysical package id X", where X is physical

Pysical --> Physical.
I would suggest to drop the "id" for consistency. Core sensor names don't
include "id" either.

I am not sure if "physical" should be included in the first place.
Also, above description suggests that future CPUs might add more sensors.
If so, the name should probably reflect the location of the current sensor,
ie be something like "Package core X" or "Core package X" or "Package X (core)".

> + package id.

2010-07-22 17:52:27

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

> > +This driver permits reading package level temperature sensor embedded inside
> > +Intel CPU package. The sensors can be in core, uncore, memroy controller, or
>
> memroy --> memory
Will change the typo.

> > +other componenets in a package. The feature is first implemented in Intel Sandy
>
> componenets --> components
>
Ditto.

> > +Bridge platform.
> > +
> Just for clarification - you mention a number of sensors, but unless
> I am missing something only the package sensor is implemented. Is that correct ?
Do you mean "only the core sensor"? I think the Sandy Bridge only implements
core sensor now. I would think in the future more sensors could be implemented.
If only core sensors are implemented, there is no need to have the package
level thermal hw MSRs because OS can get all package level thermal info by
enumerating all core sensors in a package.

>
> > +Temperature is measured in degrees Celsius and measurement resolution is
> > +1 degree C. Valid temperatures are from 0 to TjMax degrees C, because the actual
> > +value of temperature register is in fact a delta from TjMax.
> > +
> From the code, it seems that negative values can be reported. Is it guaranteed
> by the chip that (TjMax - MSR_IA32_TEMPERATURE_TARGET) >= 0 ? Otherwise, the
> minimum temperature would be (TjMax - 127).
>
> > +Temperature known as TjMax is the maximum junction temperature of package.
> > +Intel defines this temperature as 125C. At this temperature, protection
>
> Your driver bails out at TjMax >= 120, so there is some inconsistency.
> Also, it seems that this is not a constant, since you are reading
> MSR_IA32_TEMPERATURE_TARGET to get the value.
>
> Since the CPUs supporting the package sensor presumably also all support
> reading TjMax, maybe you can reword the above text to reflect this.
>
You are right. I forgot to update this part. I'll reword it.

> > +mechanism will perform actions to forcibly cool down the processor. Alarm
> > +may be raised, if the temperature grows enough (more than TjMax) to trigger
> > +the Out-Of-Spec bit. Following table summarizes the exported sysfs files:
> > +
> > +temp1_input - Package temperature (in millidegrees Celsius).
> > +temp1_crit - Maximum junction temperature (in millidegrees Celsius).
> > +temp1_crit_alarm - Set when Out-of-spec bit is set, never clears.
> > + Correct CPU operation is no longer guaranteed.
> > +temp1_label - Contains string "Pysical package id X", where X is physical
>
> Pysical --> Physical.
> I would suggest to drop the "id" for consistency. Core sensor names don't
> include "id" either.
>
> I am not sure if "physical" should be included in the first place.
> Also, above description suggests that future CPUs might add more sensors.
> If so, the name should probably reflect the location of the current sensor,
> ie be something like "Package core X" or "Core package X" or "Package X (core)".
How about just using "package X" to align with coretemp for now? Currently there
is no interface to tell which sensor it is.

Thanks.

-Fenghua

2010-07-22 18:59:23

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

On Thu, Jul 22, 2010 at 01:52:25PM -0400, Fenghua Yu wrote:
> > > +This driver permits reading package level temperature sensor embedded inside
> > > +Intel CPU package. The sensors can be in core, uncore, memroy controller, or
> >
> > memroy --> memory
> Will change the typo.
>
> > > +other componenets in a package. The feature is first implemented in Intel Sandy
> >
> > componenets --> components
> >
> Ditto.
>
> > > +Bridge platform.
> > > +
> > Just for clarification - you mention a number of sensors, but unless
> > I am missing something only the package sensor is implemented. Is that correct ?
> Do you mean "only the core sensor"? I think the Sandy Bridge only implements
> core sensor now. I would think in the future more sensors could be implemented.
> If only core sensors are implemented, there is no need to have the package
> level thermal hw MSRs because OS can get all package level thermal info by
> enumerating all core sensors in a package.
>
Now you lost me. I understand there are the per-core sensors, read through coretemp,
plus one package level sensor. Do you associate this existing package level sensor
with any of the locations mentioned above (core, uncore, memory, other), or is
that unknown ? If it is unknown, you might want to mention it.

> >
> > > +Temperature is measured in degrees Celsius and measurement resolution is
> > > +1 degree C. Valid temperatures are from 0 to TjMax degrees C, because the actual
> > > +value of temperature register is in fact a delta from TjMax.
> > > +
> > From the code, it seems that negative values can be reported. Is it guaranteed
> > by the chip that (TjMax - MSR_IA32_TEMPERATURE_TARGET) >= 0 ? Otherwise, the
> > minimum temperature would be (TjMax - 127).
> >
> > > +Temperature known as TjMax is the maximum junction temperature of package.
> > > +Intel defines this temperature as 125C. At this temperature, protection
> >
> > Your driver bails out at TjMax >= 120, so there is some inconsistency.
> > Also, it seems that this is not a constant, since you are reading
> > MSR_IA32_TEMPERATURE_TARGET to get the value.
> >
> > Since the CPUs supporting the package sensor presumably also all support
> > reading TjMax, maybe you can reword the above text to reflect this.
> >
> You are right. I forgot to update this part. I'll reword it.
>
> > > +mechanism will perform actions to forcibly cool down the processor. Alarm
> > > +may be raised, if the temperature grows enough (more than TjMax) to trigger
> > > +the Out-Of-Spec bit. Following table summarizes the exported sysfs files:
> > > +
> > > +temp1_input - Package temperature (in millidegrees Celsius).
> > > +temp1_crit - Maximum junction temperature (in millidegrees Celsius).
> > > +temp1_crit_alarm - Set when Out-of-spec bit is set, never clears.
> > > + Correct CPU operation is no longer guaranteed.
> > > +temp1_label - Contains string "Pysical package id X", where X is physical
> >
> > Pysical --> Physical.
> > I would suggest to drop the "id" for consistency. Core sensor names don't
> > include "id" either.
> >
> > I am not sure if "physical" should be included in the first place.
> > Also, above description suggests that future CPUs might add more sensors.
> > If so, the name should probably reflect the location of the current sensor,
> > ie be something like "Package core X" or "Core package X" or "Package X (core)".
> How about just using "package X" to align with coretemp for now? Currently there
> is no interface to tell which sensor it is.
>
Fine with me. Maybe wait for feedback from others.

You use the argument that there may be other package level sensors in the future.
Are there any plans for this, or is this just a theory ?

Next question is how to handle future sensor types. One hwmon instance per sensor,
additional sensors in this driver, or even a new driver ?

We had was a separate discussion if the coretemp driver should be redesigned
to one instance per CPU. The package sensor would fit into that model,
since you would have

coretemp-isa-0000
Core0
Core1
...
CoreN
Package

coretemp-isa-0001
Core0
Core1
...
CoreM
Package

I personally would prefer that approach. It would avoid ambiguity associating Package X
with specific cores, and it would also easily expand to additional non-core future sensors.

Guenter

2010-07-22 21:21:13

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

On Thu, Jul 22, 2010 at 11:58:14AM -0700, Guenter Roeck wrote:
> On Thu, Jul 22, 2010 at 01:52:25PM -0400, Fenghua Yu wrote:
> > > Just for clarification - you mention a number of sensors, but unless
> > > I am missing something only the package sensor is implemented. Is that correct ?
> > Do you mean "only the core sensor"? I think the Sandy Bridge only implements
> > core sensor now. I would think in the future more sensors could be implemented.
> > If only core sensors are implemented, there is no need to have the package
> > level thermal hw MSRs because OS can get all package level thermal info by
> > enumerating all core sensors in a package.
> >
> Now you lost me. I understand there are the per-core sensors, read through coretemp,
> plus one package level sensor. Do you associate this existing package level sensor
> with any of the locations mentioned above (core, uncore, memory, other), or is
> that unknown ? If it is unknown, you might want to mention it.

There are more than one package level sensor. The package level MSRs take into
account the maximum across the whole package, which encompasses all the cores
and other integrated components like memory controller, uncore, graphics, etc.
Sandy Bridge implements not only core sensors but also other sensors. I take
back what I said that Sandy Bridge only implements core sensors.


>
> > >
> > > > +Temperature is measured in degrees Celsius and measurement resolution is
> > > > +1 degree C. Valid temperatures are from 0 to TjMax degrees C, because the actual
> > > > +value of temperature register is in fact a delta from TjMax.
> > > > +
> > > From the code, it seems that negative values can be reported. Is it guaranteed
> > > by the chip that (TjMax - MSR_IA32_TEMPERATURE_TARGET) >= 0 ? Otherwise, the
> > > minimum temperature would be (TjMax - 127).
> > >
> > > > +Temperature known as TjMax is the maximum junction temperature of package.
> > > > +Intel defines this temperature as 125C. At this temperature, protection
> > >
> > > Your driver bails out at TjMax >= 120, so there is some inconsistency.
> > > Also, it seems that this is not a constant, since you are reading
> > > MSR_IA32_TEMPERATURE_TARGET to get the value.
> > >
> > > Since the CPUs supporting the package sensor presumably also all support
> > > reading TjMax, maybe you can reword the above text to reflect this.
> > >
> > You are right. I forgot to update this part. I'll reword it.
> >
> > > > +mechanism will perform actions to forcibly cool down the processor. Alarm
> > > > +may be raised, if the temperature grows enough (more than TjMax) to trigger
> > > > +the Out-Of-Spec bit. Following table summarizes the exported sysfs files:
> > > > +
> > > > +temp1_input - Package temperature (in millidegrees Celsius).
> > > > +temp1_crit - Maximum junction temperature (in millidegrees Celsius).
> > > > +temp1_crit_alarm - Set when Out-of-spec bit is set, never clears.
> > > > + Correct CPU operation is no longer guaranteed.
> > > > +temp1_label - Contains string "Pysical package id X", where X is physical
> > >
> > > Pysical --> Physical.
> > > I would suggest to drop the "id" for consistency. Core sensor names don't
> > > include "id" either.
> > >
> > > I am not sure if "physical" should be included in the first place.
> > > Also, above description suggests that future CPUs might add more sensors.
> > > If so, the name should probably reflect the location of the current sensor,
> > > ie be something like "Package core X" or "Core package X" or "Package X (core)".
> > How about just using "package X" to align with coretemp for now? Currently there
> > is no interface to tell which sensor it is.
> >
> Fine with me. Maybe wait for feedback from others.
>
> You use the argument that there may be other package level sensors in the future.
> Are there any plans for this, or is this just a theory ?
>

Not just a theory. Sandy Bridge already implements other package level sensors.
If really need to know exactly which sensors are implemented, we might go
through a channel before releasing the info.

> Next question is how to handle future sensor types. One hwmon instance per sensor,
> additional sensors in this driver, or even a new driver ?

Currently package level thermal just reports the maximum temperature across
the package. Which sensor is reporting the highest temperature is not exposed.

>
> We had was a separate discussion if the coretemp driver should be redesigned
> to one instance per CPU. The package sensor would fit into that model,
> since you would have
>
> coretemp-isa-0000
> Core0
> Core1
> ...
> CoreN
> Package
>
> coretemp-isa-0001
> Core0
> Core1
> ...
> CoreM
> Package
>
> I personally would prefer that approach. It would avoid ambiguity associating Package X
> with specific cores, and it would also easily expand to additional non-core future sensors.

Package X shows Physical id which is unique in platform topology and can be
got from cpuinfo. Package X doesn't have that problem, right?

Maybe instead of showing "Package X", pkgtemp may show "Physical id" just like
what cpuinfo shows?

2010-08-19 15:47:04

by Jean Delvare

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

Hi Fenghua, Guenter,

Sorry for joining the discussion a little late, I was on vacation when
it happened. I'll comment now, it's probably "too late" as the patch
set was merged meanwhile, but still...

On Thu, 22 Jul 2010 14:21:11 -0700, Fenghua Yu wrote:
> On Thu, Jul 22, 2010 at 11:58:14AM -0700, Guenter Roeck wrote:
> > You use the argument that there may be other package level sensors in the future.
> > Are there any plans for this, or is this just a theory ?
>
> Not just a theory. Sandy Bridge already implements other package level sensors.
> If really need to know exactly which sensors are implemented, we might go
> through a channel before releasing the info.
>
> > Next question is how to handle future sensor types. One hwmon instance per sensor,
> > additional sensors in this driver, or even a new driver ?
>
> Currently package level thermal just reports the maximum temperature across
> the package. Which sensor is reporting the highest temperature is not exposed.

So this isn't a real physical sensor, but more of a meta-sensor? If
this is a case, then we don't need support for this at all. User-space
can compute a maximum by itself, we don't need a dedicated kernel
driver for that.

> > We had was a separate discussion if the coretemp driver should be redesigned
> > to one instance per CPU. The package sensor would fit into that model,
> > since you would have
> >
> > coretemp-isa-0000
> > Core0
> > Core1
> > ...
> > CoreN
> > Package
> >
> > coretemp-isa-0001
> > Core0
> > Core1
> > ...
> > CoreM
> > Package
> >
> > I personally would prefer that approach. It would avoid ambiguity associating Package X
> > with specific cores, and it would also easily expand to additional non-core future sensors.

For the records, I totally support this approach. I want the coretemp
driver to be updated to present a single hwmon device per CPU, no
matter what happens to the "package temperature".

> Package X shows Physical id which is unique in platform topology and can be
> got from cpuinfo. Package X doesn't have that problem, right?
>
> Maybe instead of showing "Package X", pkgtemp may show "Physical id" just like
> what cpuinfo shows?

What Guenter meant IMHO has nothing to do with labelling or numbering.
It has to do with presenting related temperature values (that belong to
the same physical CPU) as a single hwmon entry. This is definitely the
most obvious way to present a group of related temperatures to the user.

--
Jean Delvare

2010-08-19 16:37:12

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

On Thu, Aug 19, 2010 at 11:46:32AM -0400, Jean Delvare wrote:
> Hi Fenghua, Guenter,
>
> Sorry for joining the discussion a little late, I was on vacation when
> it happened. I'll comment now, it's probably "too late" as the patch
> set was merged meanwhile, but still...
>
There was no discussion at all, unfortunately.

> On Thu, 22 Jul 2010 14:21:11 -0700, Fenghua Yu wrote:
> > On Thu, Jul 22, 2010 at 11:58:14AM -0700, Guenter Roeck wrote:
> > > You use the argument that there may be other package level sensors in the future.
> > > Are there any plans for this, or is this just a theory ?
> >
> > Not just a theory. Sandy Bridge already implements other package level sensors.
> > If really need to know exactly which sensors are implemented, we might go
> > through a channel before releasing the info.
> >
> > > Next question is how to handle future sensor types. One hwmon instance per sensor,
> > > additional sensors in this driver, or even a new driver ?
> >
> > Currently package level thermal just reports the maximum temperature across
> > the package. Which sensor is reporting the highest temperature is not exposed.
>
> So this isn't a real physical sensor, but more of a meta-sensor? If
> this is a case, then we don't need support for this at all. User-space
> can compute a maximum by itself, we don't need a dedicated kernel
> driver for that.
>
> > > We had was a separate discussion if the coretemp driver should be redesigned
> > > to one instance per CPU. The package sensor would fit into that model,
> > > since you would have
> > >
> > > coretemp-isa-0000
> > > Core0
> > > Core1
> > > ...
> > > CoreN
> > > Package
> > >
> > > coretemp-isa-0001
> > > Core0
> > > Core1
> > > ...
> > > CoreM
> > > Package
> > >
> > > I personally would prefer that approach. It would avoid ambiguity associating Package X
> > > with specific cores, and it would also easily expand to additional non-core future sensors.
>
> For the records, I totally support this approach. I want the coretemp
> driver to be updated to present a single hwmon device per CPU, no
> matter what happens to the "package temperature".
>
I might spend some time rewriting the coretemp driver as described above,
unless someone else picks it up, and unless there is opposition.
Obviously, that won't include the package sensor since there is now
a separate driver for it.

Guenter

2010-08-19 20:51:25

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

On Thu, Aug 19, 2010 at 09:27:19AM -0700, Guenter Roeck wrote:
> On Thu, Aug 19, 2010 at 11:46:32AM -0400, Jean Delvare wrote:
> > Hi Fenghua, Guenter,
> >
> > Sorry for joining the discussion a little late, I was on vacation when
> > it happened. I'll comment now, it's probably "too late" as the patch
> > set was merged meanwhile, but still...
> >
> There was no discussion at all, unfortunately.
>
> > On Thu, 22 Jul 2010 14:21:11 -0700, Fenghua Yu wrote:
> > > On Thu, Jul 22, 2010 at 11:58:14AM -0700, Guenter Roeck wrote:
> > > > You use the argument that there may be other package level sensors in the future.
> > > > Are there any plans for this, or is this just a theory ?
> > >
> > > Not just a theory. Sandy Bridge already implements other package level sensors.
> > > If really need to know exactly which sensors are implemented, we might go
> > > through a channel before releasing the info.
> > >
> > > > Next question is how to handle future sensor types. One hwmon instance per sensor,
> > > > additional sensors in this driver, or even a new driver ?
> > >
> > > Currently package level thermal just reports the maximum temperature across
> > > the package. Which sensor is reporting the highest temperature is not exposed.
> >
> > So this isn't a real physical sensor, but more of a meta-sensor? If
> > this is a case, then we don't need support for this at all. User-space
> > can compute a maximum by itself, we don't need a dedicated kernel
> > driver for that.
> >
The pkgtemp reports thermal status for a set of sensors in a package. Please
note the sensors in a package are not limited to processor sensors which are
handled by coretemp. The sensors in a package also include gfx sensors, cache
sensors, memory controllor sensors which are not handled by any hwmon drivers.

OS gets maximum package thermal status only from MSR PACKAGE_THERM_STATUS.
There is no detailed thermal info for each sensor in a package. User-space
can't compute a maximum by itself. So a piece of kernel driver code, whether
a seperate driver or a integrated driver, is necessary if user-space wants to
know thermal status of a package.

> > > > We had was a separate discussion if the coretemp driver should be redesigned
> > > > to one instance per CPU. The package sensor would fit into that model,
> > > > since you would have
> > > >
> > > > coretemp-isa-0000
> > > > Core0
> > > > Core1
> > > > ...
> > > > CoreN
> > > > Package
> > > >
> > > > coretemp-isa-0001
> > > > Core0
> > > > Core1
> > > > ...
> > > > CoreM
> > > > Package
> > > >
> > > > I personally would prefer that approach. It would avoid ambiguity associating Package X
> > > > with specific cores, and it would also easily expand to additional non-core future sensors.
> >
> > For the records, I totally support this approach. I want the coretemp
> > driver to be updated to present a single hwmon device per CPU, no
> > matter what happens to the "package temperature".
> >
> I might spend some time rewriting the coretemp driver as described above,
> unless someone else picks it up, and unless there is opposition.
> Obviously, that won't include the package sensor since there is now
> a separate driver for it.
I agree with this method too. On a multiple socket system, the current coretemp
output will cause confusion since it only outputs core# without package#.

If it's ok for you, I can rewrite this part to have hwmon device per CPU with
both core and package thermal info and send out RFC patch soon.

Thanks.

-Fenghua

2010-08-19 21:07:47

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

On Thu, Aug 19, 2010 at 04:51:20PM -0400, Fenghua Yu wrote:
> On Thu, Aug 19, 2010 at 09:27:19AM -0700, Guenter Roeck wrote:
> > On Thu, Aug 19, 2010 at 11:46:32AM -0400, Jean Delvare wrote:
> > > Hi Fenghua, Guenter,
> > >
> > > Sorry for joining the discussion a little late, I was on vacation when
> > > it happened. I'll comment now, it's probably "too late" as the patch
> > > set was merged meanwhile, but still...
> > >
> > There was no discussion at all, unfortunately.
> >
> > > On Thu, 22 Jul 2010 14:21:11 -0700, Fenghua Yu wrote:
> > > > On Thu, Jul 22, 2010 at 11:58:14AM -0700, Guenter Roeck wrote:
> > > > > You use the argument that there may be other package level sensors in the future.
> > > > > Are there any plans for this, or is this just a theory ?
> > > >
> > > > Not just a theory. Sandy Bridge already implements other package level sensors.
> > > > If really need to know exactly which sensors are implemented, we might go
> > > > through a channel before releasing the info.
> > > >
> > > > > Next question is how to handle future sensor types. One hwmon instance per sensor,
> > > > > additional sensors in this driver, or even a new driver ?
> > > >
> > > > Currently package level thermal just reports the maximum temperature across
> > > > the package. Which sensor is reporting the highest temperature is not exposed.
> > >
> > > So this isn't a real physical sensor, but more of a meta-sensor? If
> > > this is a case, then we don't need support for this at all. User-space
> > > can compute a maximum by itself, we don't need a dedicated kernel
> > > driver for that.
> > >
> The pkgtemp reports thermal status for a set of sensors in a package. Please
> note the sensors in a package are not limited to processor sensors which are
> handled by coretemp. The sensors in a package also include gfx sensors, cache
> sensors, memory controllor sensors which are not handled by any hwmon drivers.
>
> OS gets maximum package thermal status only from MSR PACKAGE_THERM_STATUS.
> There is no detailed thermal info for each sensor in a package. User-space
> can't compute a maximum by itself. So a piece of kernel driver code, whether
> a seperate driver or a integrated driver, is necessary if user-space wants to
> know thermal status of a package.
>
> > > > > We had was a separate discussion if the coretemp driver should be redesigned
> > > > > to one instance per CPU. The package sensor would fit into that model,
> > > > > since you would have
> > > > >
> > > > > coretemp-isa-0000
> > > > > Core0
> > > > > Core1
> > > > > ...
> > > > > CoreN
> > > > > Package
> > > > >
> > > > > coretemp-isa-0001
> > > > > Core0
> > > > > Core1
> > > > > ...
> > > > > CoreM
> > > > > Package
> > > > >
> > > > > I personally would prefer that approach. It would avoid ambiguity associating Package X
> > > > > with specific cores, and it would also easily expand to additional non-core future sensors.
> > >
> > > For the records, I totally support this approach. I want the coretemp
> > > driver to be updated to present a single hwmon device per CPU, no
> > > matter what happens to the "package temperature".
> > >
> > I might spend some time rewriting the coretemp driver as described above,
> > unless someone else picks it up, and unless there is opposition.
> > Obviously, that won't include the package sensor since there is now
> > a separate driver for it.
> I agree with this method too. On a multiple socket system, the current coretemp
> output will cause confusion since it only outputs core# without package#.
>
> If it's ok for you, I can rewrite this part to have hwmon device per CPU with
> both core and package thermal info and send out RFC patch soon.
>
Sounds good to me.

Guenter

2010-08-20 08:34:21

by Jean Delvare

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

Fenghua,

On Thu, 19 Aug 2010 13:51:20 -0700, Fenghua Yu wrote:
> On Thu, Aug 19, 2010 at 09:27:19AM -0700, Guenter Roeck wrote:
> > I might spend some time rewriting the coretemp driver as described above,
> > unless someone else picks it up, and unless there is opposition.
> > Obviously, that won't include the package sensor since there is now
> > a separate driver for it.
>
> I agree with this method too. On a multiple socket system, the current coretemp
> output will cause confusion since it only outputs core# without package#.

Good point.

> If it's ok for you, I can rewrite this part to have hwmon device per CPU with
> both core and package thermal info and send out RFC patch soon.

Yes, please! If you have time to work on this, it would be very great.
I am really curious to see how the driver would look like if we go with
this approach. I can test the code, too (although I understand you
won't have any difficulties getting your hands on recent Intel
systems ;)

Also see my reply in the other thread about the handling of removed
siblings. I suspect it will be very easy to add to the new design.

Side question: is it safe to assume a maximum of 2 siblings per core on
Intel x86 CPUs?

--
Jean Delvare

2010-08-20 16:58:50

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

On Fri, Aug 20, 2010 at 01:33:56AM -0700, Jean Delvare wrote:
> Fenghua,
>
> On Thu, 19 Aug 2010 13:51:20 -0700, Fenghua Yu wrote:
> > On Thu, Aug 19, 2010 at 09:27:19AM -0700, Guenter Roeck wrote:
> > > I might spend some time rewriting the coretemp driver as described above,
> > > unless someone else picks it up, and unless there is opposition.
> > > Obviously, that won't include the package sensor since there is now
> > > a separate driver for it.
> >
> > I agree with this method too. On a multiple socket system, the current coretemp
> > output will cause confusion since it only outputs core# without package#.
>
> Good point.
>
> > If it's ok for you, I can rewrite this part to have hwmon device per CPU with
> > both core and package thermal info and send out RFC patch soon.
>
> Yes, please! If you have time to work on this, it would be very great.
> I am really curious to see how the driver would look like if we go with
> this approach. I can test the code, too (although I understand you
> won't have any difficulties getting your hands on recent Intel
> systems ;)
>
> Also see my reply in the other thread about the handling of removed
> siblings. I suspect it will be very easy to add to the new design.
>
> Side question: is it safe to assume a maximum of 2 siblings per core on
> Intel x86 CPUs?
I think architecturally it's not safe to assume 2 siblings per core on x86
although so far HT implementations have been having 2 siblings per core.

Linux kernel doesn't assume 2 siblings per core during initialization (please
check arch/x86/kernel/smpboot.c). This is right way to handle potential non 2
sibling case in the future.

Thanks.

-Fenghua

2010-08-20 18:39:57

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

On 08/20/2010 01:33 AM, Jean Delvare wrote:
>
> Side question: is it safe to assume a maximum of 2 siblings per core on
> Intel x86 CPUs?
>

That is true for all current CPUs, but is not guaranteed for the future.

-hpa

2010-08-21 10:03:12

by Jean Delvare

[permalink] [raw]
Subject: Re: [PATCH 5/5] Package Level Thermal Control and Power Limit Notification: pkgtemp doc

One more thing...

On Thu, 19 Aug 2010 13:51:20 -0700, Fenghua Yu wrote:
> The pkgtemp reports thermal status for a set of sensors in a package. Please
> note the sensors in a package are not limited to processor sensors which are
> handled by coretemp. The sensors in a package also include gfx sensors, cache
> sensors, memory controllor sensors which are not handled by any hwmon drivers.

By "package" you mean CPU package, right? So by "gfx sensors" and
"memory controllor sensors", you refer to features possibly embedded
into the CPU package, not sensors outside the CPU, right?

> OS gets maximum package thermal status only from MSR PACKAGE_THERM_STATUS.
> There is no detailed thermal info for each sensor in a package. User-space
> can't compute a maximum by itself. So a piece of kernel driver code, whether
> a seperate driver or a integrated driver, is necessary if user-space wants to
> know thermal status of a package.

OK, now I understand, thanks for the clarification.

--
Jean Delvare