2008-06-22 00:47:20

by Rene Herman

[permalink] [raw]
Subject: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Good day.

On 2.6.26-rc and perhaps earlier, when I enable the ACPI Thermal Zone
support (CONFIG_ACPI_THERMAL) I see in dmesg:

ACPI: LNXTHERM:01 is registered as thermal_zone0
ACPI: Thermal Zone [THRM] (56 C)

My /sys/class/hwmon/hwmon0 (a W83782D chip) becomes hwmon1, there's a
new /sys/class/hwmon/hwmon0 and "sensors -s" craps out with:

# sensors -s
Can't access procfs/sysfs file
Kernel interface access error
For 2.6 kernels, make sure you have mounted sysfs and libsensors
was compiled with sysfs support!

# sensors --version
sensors version 2.10.6 with libsensors version 2.10.6

This is the slackware 12.1 (recent) standard version. What's wrong?

In case it's useful, my /etc/sensors.conf is at:

http://members.home.nl/rene.herman/sensors.conf

Rene.


2008-06-22 07:23:29

by Hans de Goede

[permalink] [raw]
Subject: Re: [lm-sensors] LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Rene Herman wrote:
> Good day.
>
> On 2.6.26-rc and perhaps earlier, when I enable the ACPI Thermal Zone
> support (CONFIG_ACPI_THERMAL) I see in dmesg:
>
> ACPI: LNXTHERM:01 is registered as thermal_zone0
> ACPI: Thermal Zone [THRM] (56 C)
>
> My /sys/class/hwmon/hwmon0 (a W83782D chip) becomes hwmon1, there's a
> new /sys/class/hwmon/hwmon0 and "sensors -s" craps out with:
>
> # sensors -s
> Can't access procfs/sysfs file
> Kernel interface access error
> For 2.6 kernels, make sure you have mounted sysfs and libsensors
> was compiled with sysfs support!
>
> # sensors --version
> sensors version 2.10.6 with libsensors version 2.10.6
>
> This is the slackware 12.1 (recent) standard version. What's wrong?
>
> In case it's useful, my /etc/sensors.conf is at:
>
> http://members.home.nl/rene.herman/sensors.conf
>

I'm pretty sure this caused by your lm_sensors using space being to old to
support the new thermalzone stuff. You need atleast 3.0.2 to support the
thermalzone driver.

Regards,

Hans

2008-06-22 07:24:45

by Hans de Goede

[permalink] [raw]
Subject: Re: [lm-sensors] LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Rene Herman wrote:
> Good day.
>
> On 2.6.26-rc and perhaps earlier, when I enable the ACPI Thermal Zone
> support (CONFIG_ACPI_THERMAL) I see in dmesg:
>
> ACPI: LNXTHERM:01 is registered as thermal_zone0
> ACPI: Thermal Zone [THRM] (56 C)
>
> My /sys/class/hwmon/hwmon0 (a W83782D chip) becomes hwmon1, there's a
> new /sys/class/hwmon/hwmon0 and "sensors -s" craps out with:
>
> # sensors -s
> Can't access procfs/sysfs file
> Kernel interface access error
> For 2.6 kernels, make sure you have mounted sysfs and libsensors
> was compiled with sysfs support!
>
> # sensors --version
> sensors version 2.10.6 with libsensors version 2.10.6
>
> This is the slackware 12.1 (recent) standard version. What's wrong?
>
> In case it's useful, my /etc/sensors.conf is at:
>
> http://members.home.nl/rene.herman/sensors.conf
>

I'm pretty sure this caused by your lm_sensors using space being too old to
support the new thermalzone stuff.

<Correction that should ofcourse read "userspace" not "using space", so the
correct reply I was trying to send is>:

I'm pretty sure this caused by your lm_sensors using space being too old to
support the new thermalzone stuff. You need atleast 3.0.2 to support the
thermalzone driver.

Regards,

Hans

2008-06-22 13:15:05

by Rene Herman

[permalink] [raw]
Subject: [REGRESSION, ABI] Re: [lm-sensors] LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On 22-06-08 09:28, Hans de Goede wrote:

This is an ABI breakage issue and an unfortunate one at that:

> Rene Herman wrote:

>> On 2.6.26-rc and perhaps earlier, when I enable the ACPI Thermal Zone
>> support (CONFIG_ACPI_THERMAL) I see in dmesg:

Note, 2.6.25-rc7 works fine with it enabled.

>> ACPI: LNXTHERM:01 is registered as thermal_zone0
>> ACPI: Thermal Zone [THRM] (56 C)
>>
>> My /sys/class/hwmon/hwmon0 (a W83782D chip) becomes hwmon1, there's a
>> new /sys/class/hwmon/hwmon0 and "sensors -s" craps out with:
>>
>> # sensors -s
>> Can't access procfs/sysfs file
>> Kernel interface access error
>> For 2.6 kernels, make sure you have mounted sysfs and libsensors
>> was compiled with sysfs support!
>>
>> # sensors --version
>> sensors version 2.10.6 with libsensors version 2.10.6
>>
>> This is the slackware 12.1 (recent) standard version. What's wrong?
>>
>> In case it's useful, my /etc/sensors.conf is at:
>>
>> http://members.home.nl/rene.herman/sensors.conf
>
> I'm pretty sure this caused by your lm_sensors using space being to old
> to support the new thermalzone stuff. You need atleast 3.0.2 to support
> the thermalzone driver.

I see. I was about to mark this up as Volkerding doing his usual "if it
has a lower version number it must be better" thing but in this case it
seems it's hwmon or ACPI which is to blame.

Firstly -- with CONFIG_ACPI_THERMAL selected my sensors work fine on
2.6.25-rc7 with the above 2.10.6 lm_sensors userspace. Now, with
2.6.26-rc (*) they do not as per above.

This is ABI breakage. I wouldn't care if my older lm_sensors userspace
couldn't handle the ACPI Thermal Zone, but I do care that even having it
breaks my other sensors; especially given the CONFIG_ACPI_THERMAL help
text which can not be read as recommending to disable it:

This driver adds support for ACPI thermal zones. Most mobile and
some desktop systems support ACPI thermal zones. It is HIGHLY
recommended that this option be enabled, as your processor(s)
may be damaged without it.

Now, I'm actually usally a big fan of not dragging around old gunk
forever, ABI be damned, but in this case this really won't do. 2.6.10 is
a recent maintenance release and I see for the new 3.0 branch:

http://www.lm-sensors.org/wiki/Download

===
Most third party monitoring applications do not yet work with the
library in this package. We are encouraging authors to port their
applications to the new library. We already have patches for xsensors
0.60, gkrellm-2.3.0, net-snmp-5.4.1 (configure with
--with-mib-modules="ucd-snmp/lmsensorsMib" --with-ldflags="-lsensors"),
xfce4-sensors-plugin-0.10.99.2, kdebase-3.5.8(ksysguard),
sensors-applet-1.8.1 and ksensors-0.7.3-fedora-14.tar.gz (upstream is
dead this tarbal contains a version with all Debian's changes + 2
patches from Fedora, including lm_sensors-3.x support).
===

So it seems we have here a change to the kernel requiring a userspace
basically noone is ready for and which at least the (again, recent)
slackware 12.1 doesn't ship as a result. This is ABI breakage of the
really bad kind.

If there's not just something I'm missing, could someone please get
Linus a patch ASAP making whatever breaks lm_sensors 2 optional,
disabled by default and with a help text that warns that enabling it
requires a new lm_sensors userspace?

I haven't seen other complaints about this and would've expected them so
it might ofcourse be possible that I'm just missing something and have a
very specific problem; if in that case someone could advice what it
might be -- please do.

But if not, .26 is around the corner and requiring libsensors-3.0 must
really not be.

(*) 2.6.26-rc7 at least. I actually noticed this early in the -rc stage
but had too many other breakage at that point to worry about this one. I
just disabled the ACPI Thermal Zone support and forgot about it upto
this point.

Rene.

2008-06-22 13:22:39

by Rene Herman

[permalink] [raw]
Subject: Re: [REGRESSION, ABI] Re: [lm-sensors] LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On 22-06-08 15:15, Rene Herman wrote:

> On 22-06-08 09:28, Hans de Goede wrote:
>
> This is an ABI breakage issue and an unfortunate one at that:
>
>> Rene Herman wrote:
>
>>> On 2.6.26-rc and perhaps earlier, when I enable the ACPI Thermal Zone
>>> support (CONFIG_ACPI_THERMAL) I see in dmesg:
>
> Note, 2.6.25-rc7 works fine with it enabled.

Sorry -- 2.6.25.7 I mean.

>>> ACPI: LNXTHERM:01 is registered as thermal_zone0
>>> ACPI: Thermal Zone [THRM] (56 C)
>>>
>>> My /sys/class/hwmon/hwmon0 (a W83782D chip) becomes hwmon1, there's a
>>> new /sys/class/hwmon/hwmon0 and "sensors -s" craps out with:
>>>
>>> # sensors -s
>>> Can't access procfs/sysfs file
>>> Kernel interface access error
>>> For 2.6 kernels, make sure you have mounted sysfs and libsensors
>>> was compiled with sysfs support!
>>>
>>> # sensors --version
>>> sensors version 2.10.6 with libsensors version 2.10.6
>>>
>>> This is the slackware 12.1 (recent) standard version. What's wrong?
>>>
>>> In case it's useful, my /etc/sensors.conf is at:
>>>
>>> http://members.home.nl/rene.herman/sensors.conf
>>
>> I'm pretty sure this caused by your lm_sensors using space being to
>> old to support the new thermalzone stuff. You need atleast 3.0.2 to
>> support the thermalzone driver.
>
> I see. I was about to mark this up as Volkerding doing his usual "if it
> has a lower version number it must be better" thing but in this case it
> seems it's hwmon or ACPI which is to blame.
>
> Firstly -- with CONFIG_ACPI_THERMAL selected my sensors work fine on
> 2.6.25-rc7 with the above 2.10.6 lm_sensors userspace. Now, with
> 2.6.26-rc (*) they do not as per above.
>
> This is ABI breakage. I wouldn't care if my older lm_sensors userspace
> couldn't handle the ACPI Thermal Zone, but I do care that even having it
> breaks my other sensors; especially given the CONFIG_ACPI_THERMAL help
> text which can not be read as recommending to disable it:
>
> This driver adds support for ACPI thermal zones. Most mobile and
> some desktop systems support ACPI thermal zones. It is HIGHLY
> recommended that this option be enabled, as your processor(s)
> may be damaged without it.
>
> Now, I'm actually usally a big fan of not dragging around old gunk
> forever, ABI be damned, but in this case this really won't do. 2.6.10 is
> a recent maintenance release and I see for the new 3.0 branch:
>
> http://www.lm-sensors.org/wiki/Download
>
> ===
> Most third party monitoring applications do not yet work with the
> library in this package. We are encouraging authors to port their
> applications to the new library. We already have patches for xsensors
> 0.60, gkrellm-2.3.0, net-snmp-5.4.1 (configure with
> --with-mib-modules="ucd-snmp/lmsensorsMib" --with-ldflags="-lsensors"),
> xfce4-sensors-plugin-0.10.99.2, kdebase-3.5.8(ksysguard),
> sensors-applet-1.8.1 and ksensors-0.7.3-fedora-14.tar.gz (upstream is
> dead this tarbal contains a version with all Debian's changes + 2
> patches from Fedora, including lm_sensors-3.x support).
> ===
>
> So it seems we have here a change to the kernel requiring a userspace
> basically noone is ready for and which at least the (again, recent)
> slackware 12.1 doesn't ship as a result. This is ABI breakage of the
> really bad kind.
>
> If there's not just something I'm missing, could someone please get
> Linus a patch ASAP making whatever breaks lm_sensors 2 optional,
> disabled by default and with a help text that warns that enabling it
> requires a new lm_sensors userspace?
>
> I haven't seen other complaints about this and would've expected them so
> it might ofcourse be possible that I'm just missing something and have a
> very specific problem; if in that case someone could advice what it
> might be -- please do.
>
> But if not, .26 is around the corner and requiring libsensors-3.0 must
> really not be.
>
> (*) 2.6.26-rc7 at least. I actually noticed this early in the -rc stage
> but had too many other breakage at that point to worry about this one. I
> just disabled the ACPI Thermal Zone support and forgot about it upto
> this point.

Rene.

2008-06-22 14:24:33

by Hans de Goede

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Rene Herman wrote:
> This is an ABI breakage issue and an unfortunate one at that:
>

No it is not, in 2.6.26rcX, the acpi thermalzones have grown a hwmon interface,
that is they register a hwmon device so that "sensors" and other lm_sensors ABI
compliant using applications can read the zone temperatures using an existing
ABI instead of adding yet another ABI.

The problem is that the hwmon entries for the thermalzone device lack a
"device" symlink under /sys/class/hwmon/hwmonX, as they are not tied to a
specific device. lm_sensors-2.10.x barfs on this, which would not be a problem
if it would simply skip with the new hwmon which it does not understand
(because of the missing device link, which is not a part of the documented
ABI), but instead of skipping it, if I understand you correctly it aborts and
never gets to hwmon1 (which is 100% unchanged and should still work fine).

I wonder what just plain "sensors" (without the -s) does.

Still this is an issue that needs fixing, but not on the kernel side, but
rather on the lm_sensors-2.10.x side, I guess its time for a new 2.10.x release.

>> I'm pretty sure this caused by your lm_sensors using space being to old
>> to support the new thermalzone stuff. You need atleast 3.0.2 to support
>> the thermalzone driver.
>
> I see. I was about to mark this up as Volkerding doing his usual "if it
> has a lower version number it must be better" thing but in this case it
> seems it's hwmon or ACPI which is to blame.
>

Actually its lm_sensors userspace which is to blame, as instead of skipping the
new hwmon device which it doesn't understand it aborts (atleast that is what I
understand from what you're telling us).

> This is ABI breakage. I wouldn't care if my older lm_sensors userspace
> couldn't handle the ACPI Thermal Zone, but I do care that even having it
> breaks my other sensors;

Yes, and that is exactly the part which is an lm_sensors userspace bug.

> Now, I'm actually usally a big fan of not dragging around old gunk
> forever, ABI be damned, but in this case this really won't do. 2.6.10 is
> a recent maintenance release and I see for the new 3.0 branch:
>
> http://www.lm-sensors.org/wiki/Download
>
> ===
> Most third party monitoring applications do not yet work with the
> library in this package. We are encouraging authors to port their
> applications to the new library. We already have patches for xsensors
> 0.60, gkrellm-2.3.0, net-snmp-5.4.1 (configure with
> --with-mib-modules="ucd-snmp/lmsensorsMib" --with-ldflags="-lsensors"),
> xfce4-sensors-plugin-0.10.99.2, kdebase-3.5.8(ksysguard),
> sensors-applet-1.8.1 and ksensors-0.7.3-fedora-14.tar.gz (upstream is
> dead this tarbal contains a version with all Debian's changes + 2
> patches from Fedora, including lm_sensors-3.x support).
> ===
>
> So it seems we have here a change to the kernel requiring a userspace
> basically noone is ready for

Erm if you look at that same page you will notice there are links to patches
for almost every userspace package which uses lm_sensors, I know as I wrote
most of them, quite a few of them have been integrated by their resp. upstreams.

Also "noone is ready for"? lm_sensors-3.0.x is the default in both Fedora 9 and
OpenSuse 11.

> But if not, .26 is around the corner and requiring libsensors-3.0 must
> really not be.
>

I agree that requiring libsensors-3.0.2 for this is not a good solution, but I
don't want to be crippling the kernel for what I believe is a bug in 2.10.x either.

So we need todo 2 or 3 things:
1) Find out if this really is as big an issue as you make it, maybe
"sensors -s" is rightfully complaining about hwmon0 and then still happily
doing its job for hwmon1?

Again, what does plain "sensors" say? Does it still show the hwmon1
readings, and are the limits what they should be after sensors -s?

2) Fix this in the 2.10.x series (which are still supported) and do a new
2.10.x release.

3) If this really completely messes 2.10.x and in that case I'm afraid we will
have to make kernel side changes.

Regards,

Hans

2008-06-22 15:25:40

by Rene Herman

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 4b62852..f6a7652 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -12,3 +12,12 @@ menuconfig THERMAL
cooling devices.
All platforms with ACPI thermal support can use this driver.
If you want this support, you should say Y or M here.
+
+config THERMAL_HWMON
+ bool "Hardware monitoring support"
+ depends on HWMON=y || HWMON=THERMAL
+ help
+ The generic thermal sysfs driver's hardware monitoring support
+ requires a 3.0 or later lm_sensors userspace.
+
+ Say Y if you have a new enough lm_sensors userspace.
diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 6098787..fe07462 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -295,8 +295,8 @@ thermal_cooling_device_trip_point_show(struct device *dev,

/* Device management */

-#if defined(CONFIG_HWMON) || \
- (defined(CONFIG_HWMON_MODULE) && defined(CONFIG_THERMAL_MODULE))
+#if defined(CONFIG_THERMAL_HWMON)
+
/* hwmon sys I/F */
#include <linux/hwmon.h>
static LIST_HEAD(thermal_hwmon_list);


Attachments:
thermal-hwmon.diff (1.10 kB)

2008-06-22 15:43:58

by Rene Herman

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On 22-06-08 16:29, Hans de Goede wrote:

Oh, and...

> I wonder what just plain "sensors" (without the -s) does.

Same thing as sensors -s:

rene@7ixe4:~$ sensors

Can't access procfs/sysfs file
Kernel interface access error
For 2.6 kernels, make sure you have mounted sysfs and libsensors
was compiled with sysfs support!

And no, the values are not being set by sensors -s, it's not just
complaining but still working.

Rene.

2008-06-22 18:02:19

by Hans de Goede

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Rene Herman wrote:
> On 22-06-08 16:29, Hans de Goede wrote:
>
>> Rene Herman wrote:
>>> This is an ABI breakage issue and an unfortunate one at that:
>>>
>>
>> No it is not, in 2.6.26rcX, the acpi thermalzones have grown a hwmon
>> interface, that is they register a hwmon device so that "sensors" and
>
> [ ... ]
>
> Now what? Yes it is. 2.6.25.7 works and 2.6.26-rcX with the same config
> options and the same userspace does not.

Know what? No it isn't. Just because some random userspace apps breaks because
certain _assumptions_ no longer hold true, does not make something an ABI breakage.

I agree with you that the results are still no good though.

Know something else? I've just stopped caring about this issue, I'm not the
author of the changes causing said breakage. I'm merely an lm_sensors (both
userspace and kernel space) developer who was heavily involved in getting this
fixed for lm_sensors-3.0.2, and I believe that adding yet another kconfig
option which we then carry for years and years is _not_ a good solution. Some
userspace utlities like udev sit very close to the kernel and sometimes an
kernel update mandates a new udev. To me this is much the same.

But at the end of the day, I do not feel responsible for this as I'm not the
author of the code causing the breakage. I'm just someone who knows the ins and
outs and tried to help, but given the treatment and thanks I've been getting
for my help I'm stopping with helping now.

Regards,

Hans

2008-06-22 18:25:32

by Rene Herman

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On 22-06-08 20:07, Hans de Goede wrote:

> Rene Herman wrote:
>> On 22-06-08 16:29, Hans de Goede wrote:
>>
>>> Rene Herman wrote:
>>>> This is an ABI breakage issue and an unfortunate one at that:
>>>>
>>>
>>> No it is not, in 2.6.26rcX, the acpi thermalzones have grown a hwmon
>>> interface, that is they register a hwmon device so that "sensors" and
>>
>> [ ... ]
>>
>> Now what? Yes it is. 2.6.25.7 works and 2.6.26-rcX with the same
>> config options and the same userspace does not.
>
> Know what? No it isn't. Just because some random userspace apps breaks
> because certain _assumptions_ no longer hold true, does not make
> something an ABI breakage.
>
> I agree with you that the results are still no good though.
>
> Know something else? I've just stopped caring about this issue, I'm not
> the author of the changes causing said breakage. I'm merely an
> lm_sensors (both userspace and kernel space) developer who was heavily
> involved in getting this fixed for lm_sensors-3.0.2, and I believe that
> adding yet another kconfig option which we then carry for years and
> years is _not_ a good solution. Some userspace utlities like udev sit
> very close to the kernel and sometimes an kernel update mandates a new
> udev. To me this is much the same.
>
> But at the end of the day, I do not feel responsible for this as I'm not
> the author of the code causing the breakage. I'm just someone who knows
> the ins and outs and tried to help, but given the treatment and thanks
> I've been getting for my help I'm stopping with helping now.

What on earth are you talking about? Could you please re-read? I didn't
"treat badly" you, hwmon, acpi or whatever.

I'm simply pointing out the problem that 2.6.26 is going to break all
setups using lm_sensors 2.0 (which among many, many others includes
every single slackware and derivative system on the planet).

We are not having a flamewar. If you think that every disagreement or
pointing out of a problem constitutes as much, that in itself is a
problem but it's not mine. I reported the problem and then posted a
patch that solves it one particular way.

Another way to solve it _could_ be to just make up a device link if
something generic is available so that sensors doesn't trip over it in
the first place but I don't know if that's a good option. You might.

I haven't a clue what you're talking about. Treatment? What treatment? I
just want to get the above mentioned problem fixed and didn't suggest
anything else. Let's get the problem fixed.

Rene.

2008-06-22 21:58:32

by Rene Herman

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

>From b51a5f4105f6b5ff52a2819a96dabe2ca3116128 Mon Sep 17 00:00:00 2001
From: Rene Herman <[email protected]>
Date: Sun, 22 Jun 2008 23:45:04 +0200
Subject: [PATCH] thermal: link the virtual/thermal/thermal_zoneN device for hwmon

2.6.26-rc gained a hwmon interface to the ACPI Thermal Zone
driver which unfortunately breaks lm-sensors 2 userspace and
renders all other (subsequent) hwmon sensors inoperable also.

Many systems, current slackware systems among them, are still
using lm-sensors 2 and would be affected.

The problem is userspace tripping over an absent "device"
link in the ACPI Thermal Zone sysfs /sys/class/hwmon/hwmonN
directory. This just adds the virtual "thermal/thermal_zoneN"
device as a device backlink which satisfies lm-sensors again.

Tested and looks fairly obvious, but this wants comment from
a thermal_zone person, to confirm/deny that this is a proper
device pointer to use here for one.

Signed-off-by: Rene Herman <[email protected]>
CC: Hans de Goede <[email protected]>
CC: Zhang Rui <[email protected]>
CC: Mark M. Hoffman <[email protected]>
CC: Jean Delvare <[email protected]>
CC: [email protected]
CC: [email protected]
CC; [email protected]
---
drivers/thermal/thermal_sys.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 6098787..c21e03c 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -357,7 +357,7 @@ thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)

INIT_LIST_HEAD(&hwmon->tz_list);
strlcpy(hwmon->type, tz->type, THERMAL_NAME_LENGTH);
- hwmon->device = hwmon_device_register(NULL);
+ hwmon->device = hwmon_device_register(&tz->device);
if (IS_ERR(hwmon->device)) {
result = PTR_ERR(hwmon->device);
goto free_mem;
--
1.5.5


Attachments:
0001-thermal-link-the-virtual-thermal-thermal_zoneN-devi.patch (1.83 kB)

2008-06-23 01:44:20

by Zhang, Rui

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors


On Mon, 2008-06-23 at 05:58 +0800, Rene Herman wrote:
> On 22-06-08 20:25, Rene Herman wrote:
>
> > Another way to solve it _could_ be to just make up a device link if
> > something generic is available so that sensors doesn't trip over it
> > in the first place but I don't know if that's a good option. You
> > might.
>
> This also works for me and, if correct, is ofcourse better than the
> CONFIG option. Wants a comment from the thermal_zone side (for which
> Zhang Rui seems the correct CC?) though.
Hi, Rene and Hans,

Thank you for your efforts on this issue and sorry for the late
response, I did not check my email during the whole weekend.

About the hwmon ABI, after the device symbol link is created, are there
any other ABIs required in the device node?
If no, this patch seems to work, although it might break if the first
registered ACPI thermal zone device is unregistered, which ONLY happens
theoretically.

thanks,
rui

>
> plain text document attachment
> (0001-thermal-link-the-virtual-thermal-thermal_zoneN-devi.patch),
> "0001-thermal-link-the-virtual-thermal-thermal_zoneN-devi.patch"
> From b51a5f4105f6b5ff52a2819a96dabe2ca3116128 Mon Sep 17 00:00:00 2001
> From: Rene Herman <[email protected]>
> Date: Sun, 22 Jun 2008 23:45:04 +0200
> Subject: [PATCH] thermal: link the virtual/thermal/thermal_zoneN device for hwmon
>
> 2.6.26-rc gained a hwmon interface to the ACPI Thermal Zone
> driver which unfortunately breaks lm-sensors 2 userspace and
> renders all other (subsequent) hwmon sensors inoperable also.
>
> Many systems, current slackware systems among them, are still
> using lm-sensors 2 and would be affected.
>
> The problem is userspace tripping over an absent "device"
> link in the ACPI Thermal Zone sysfs /sys/class/hwmon/hwmonN
> directory. This just adds the virtual "thermal/thermal_zoneN"
> device as a device backlink which satisfies lm-sensors again.
>
> Tested and looks fairly obvious, but this wants comment from
> a thermal_zone person, to confirm/deny that this is a proper
> device pointer to use here for one.
>
> Signed-off-by: Rene Herman <[email protected]>
> CC: Hans de Goede <[email protected]>
> CC: Zhang Rui <[email protected]>
> CC: Mark M. Hoffman <[email protected]>
> CC: Jean Delvare <[email protected]>
> CC: [email protected]
> CC: [email protected]
> CC; [email protected]
> ---
> drivers/thermal/thermal_sys.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
> index 6098787..c21e03c 100644
> --- a/drivers/thermal/thermal_sys.c
> +++ b/drivers/thermal/thermal_sys.c
> @@ -357,7 +357,7 @@ thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)
>
> INIT_LIST_HEAD(&hwmon->tz_list);
> strlcpy(hwmon->type, tz->type, THERMAL_NAME_LENGTH);
> - hwmon->device = hwmon_device_register(NULL);
> + hwmon->device = hwmon_device_register(&tz->device);
> if (IS_ERR(hwmon->device)) {
> result = PTR_ERR(hwmon->device);
> goto free_mem;

2008-06-23 05:15:39

by Hans de Goede

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Zhang Rui wrote:
> On Mon, 2008-06-23 at 05:58 +0800, Rene Herman wrote:
>> On 22-06-08 20:25, Rene Herman wrote:
>>
>>> Another way to solve it _could_ be to just make up a device link if
>>> something generic is available so that sensors doesn't trip over it
>>> in the first place but I don't know if that's a good option. You
>>> might.
>> This also works for me and, if correct, is ofcourse better than the
>> CONFIG option. Wants a comment from the thermal_zone side (for which
>> Zhang Rui seems the correct CC?) though.
> Hi, Rene and Hans,
>
> Thank you for your efforts on this issue and sorry for the late
> response, I did not check my email during the whole weekend.
>
> About the hwmon ABI, after the device symbol link is created, are there
> any other ABIs required in the device node?

Not that I know of, Jean ?

Regards,

Hans

2008-06-23 10:09:15

by Jean Delvare

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Hi Hans,

On Sun, 22 Jun 2008 16:29:50 +0200, Hans de Goede wrote:
> Rene Herman wrote:
> > This is an ABI breakage issue and an unfortunate one at that:
>
> No it is not, (...)

Technically speaking, Hans is right. The problem doesn't qualify as an
"ABI breakage". Just because a kernel update broke user-space, doesn't
imply API breakage.

That being said, the truth is that we don't really care how the
breakage is called. It's broken and needs to be fixed, that's all we
care about.

> (...)
> Actually its lm_sensors userspace which is to blame, as instead of skipping the
> new hwmon device which it doesn't understand it aborts (atleast that is what I
> understand from what you're telling us).

Yes, user-space is to blame.

> > (...)
> > So it seems we have here a change to the kernel requiring a userspace
> > basically noone is ready for
>
> Erm if you look at that same page you will notice there are links to patches
> for almost every userspace package which uses lm_sensors, I know as I wrote
> most of them, quite a few of them have been integrated by their resp. upstreams.

Can you please update the wiki to say so? I was myself wondering
whether your patches had been integrated or not by now.

Maybe we could present things a bit better, for example with a table or
a bullet-list, listing all applications, with the minimum version that
has libsensors 3 support (for applications which are already updated
upstream) or a link to the patch (for those which are not.) This could
stay on the Download page, or be moved to a dedicated page. If you have
some time to do something like this, please do. If not, please tell me,
I'll take care of the reformatting and then you can update the entries.

> Also "noone is ready for"? lm_sensors-3.0.x is the default in both Fedora 9 and
> OpenSuse 11.

Note: openSuse 11.0 ships with both libsensors 2.10.6 and libsensors
3.0.2. sensors links with the latter, but other applications presumably
link with the former. This version of openSuse is the transition one.
The plan is to only have libsensors 3.x in openSuse 11.1.

> > But if not, .26 is around the corner and requiring libsensors-3.0 must
> > really not be.
>
> I agree that requiring libsensors-3.0.2 for this is not a good solution, but I
> don't want to be crippling the kernel for what I believe is a bug in 2.10.x either.

Actually, requiring libsensors 3.0.2 wouldn't be such a bad solution.
libsensors 2.10.x lacks support for many new devices. It also needs
libsysfs, which most distributions are trying to get rid of.

You can't at the same time upgrade to the latest kernel and expect
legacy branches of all user-space tools to keep working. Legacy means,
well, legacy. Mixing that with bleeding edge is asking for trouble.

> (...)
> 2) Fix this in the 2.10.x series (which are still supported) and do a new
> 2.10.x release.

In fact, it is already fixed. If you look carefully at:
http://www.lm-sensors.org/wiki/Download

You'll see that I maintain a list of recommended patches for the last
version of each branch of lm-sensors. For version 2.10.6, there are
currently 3 recommended patches, the first one being:
http://www.lm-sensors.org/changeset/5147

Which, you guess, fixes the problem reported by Rene.

So all we have to do is release lm-sensors 2.10.7 soon.

> 3) If this really completely messes 2.10.x and in that case I'm afraid we will
> have to make kernel side changes.

No, the kernel does the right thing and does not need to be modified at
all.

--
Jean Delvare

2008-06-23 10:23:37

by Rene Herman

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On 23-06-08 12:08, Jean Delvare wrote:

> No, the kernel does the right thing and does not need to be modified
> at all.

No Jean, this is totally unacceptable. No matter how you want to call
things, 2.6.26 is going to break important functionality on millions of
systems and you simply do not get to do that. Can you comment on the
last patch posted? It's trivial:

http://lkml.org/lkml/2008/6/22/243

Rene.

2008-06-23 10:40:31

by Rene Herman

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On 23-06-08 03:44, Zhang Rui wrote:

> On Mon, 2008-06-23 at 05:58 +0800, Rene Herman wrote:
>> On 22-06-08 20:25, Rene Herman wrote:
>>
>>> Another way to solve it _could_ be to just make up a device link if
>>> something generic is available so that sensors doesn't trip over it
>>> in the first place but I don't know if that's a good option. You
>>> might.
>> This also works for me and, if correct, is ofcourse better than the
>> CONFIG option. Wants a comment from the thermal_zone side (for which
>> Zhang Rui seems the correct CC?) though.
> Hi, Rene and Hans,
>
> Thank you for your efforts on this issue and sorry for the late
> response, I did not check my email during the whole weekend.
>
> About the hwmon ABI, after the device symbol link is created, are there
> any other ABIs required in the device node?

Doesn't seem to. libsensors-2.10.6 is happy with this at least.

> If no, this patch seems to work, although it might break if the first
> registered ACPI thermal zone device is unregistered, which ONLY happens
> theoretically.

Mmm. Because more thermal zones may share one hwmon interface I gather?
Do you feel this is an okay minimal fix for 2.6.26 or is there something
else trivial enough available? We're late in the -rc stage...

>> From: Rene Herman <[email protected]>
>> Date: Sun, 22 Jun 2008 23:45:04 +0200
>> Subject: [PATCH] thermal: link the virtual/thermal/thermal_zoneN device for hwmon
>>
>> 2.6.26-rc gained a hwmon interface to the ACPI Thermal Zone
>> driver which unfortunately breaks lm-sensors 2 userspace and
>> renders all other (subsequent) hwmon sensors inoperable also.
>>
>> Many systems, current slackware systems among them, are still
>> using lm-sensors 2 and would be affected.
>>
>> The problem is userspace tripping over an absent "device"
>> link in the ACPI Thermal Zone sysfs /sys/class/hwmon/hwmonN
>> directory. This just adds the virtual "thermal/thermal_zoneN"
>> device as a device backlink which satisfies lm-sensors again.
>>
>> Tested and looks fairly obvious, but this wants comment from
>> a thermal_zone person, to confirm/deny that this is a proper
>> device pointer to use here for one.
>>
>> Signed-off-by: Rene Herman <[email protected]>
>> CC: Hans de Goede <[email protected]>
>> CC: Zhang Rui <[email protected]>
>> CC: Mark M. Hoffman <[email protected]>
>> CC: Jean Delvare <[email protected]>
>> CC: [email protected]
>> CC: [email protected]
>> CC; [email protected]
>> ---
>> drivers/thermal/thermal_sys.c | 2 +-
>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
>> index 6098787..c21e03c 100644
>> --- a/drivers/thermal/thermal_sys.c
>> +++ b/drivers/thermal/thermal_sys.c
>> @@ -357,7 +357,7 @@ thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)
>>
>> INIT_LIST_HEAD(&hwmon->tz_list);
>> strlcpy(hwmon->type, tz->type, THERMAL_NAME_LENGTH);
>> - hwmon->device = hwmon_device_register(NULL);
>> + hwmon->device = hwmon_device_register(&tz->device);
>> if (IS_ERR(hwmon->device)) {
>> result = PTR_ERR(hwmon->device);
>> goto free_mem;
>
>

2008-06-23 10:56:57

by Jean Delvare

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Hi Rene,

On Sun, 22 Jun 2008 20:25:47 +0200, Rene Herman wrote:
> On 22-06-08 20:07, Hans de Goede wrote:
> > Know something else? I've just stopped caring about this issue, I'm not
> > the author of the changes causing said breakage. I'm merely an
> > lm_sensors (both userspace and kernel space) developer who was heavily
> > involved in getting this fixed for lm_sensors-3.0.2, and I believe that
> > adding yet another kconfig option which we then carry for years and
> > years is _not_ a good solution. Some userspace utlities like udev sit
> > very close to the kernel and sometimes an kernel update mandates a new
> > udev. To me this is much the same.
> >
> > But at the end of the day, I do not feel responsible for this as I'm not
> > the author of the code causing the breakage. I'm just someone who knows
> > the ins and outs and tried to help, but given the treatment and thanks
> > I've been getting for my help I'm stopping with helping now.
>
> What on earth are you talking about? Could you please re-read? I didn't
> "treat badly" you, hwmon, acpi or whatever.

You certainly did. I felt the same as Hans when reading the discussion
thread.

Now, you are lucky that I already know you and I know that you are
usually helpful and cooperative, so I know that you probably didn't
mean the aggressive tone you used. But presumably Hans doesn't know
you, so his reaction was understandable.

Hans is one of the main contributors to the hwmon subsystem and the
lm-sensors project in general. I would be grateful if the users he
tries to help could treat him with the respect he reserves. If he was
to leave the project, this would be a significant loss, and this is
something I really want to avoid.

Thanks,
--
Jean Delvare

2008-06-23 11:06:22

by Jean Delvare

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Hi Rui,

On Mon, 23 Jun 2008 09:44:42 +0800, Zhang Rui wrote:
> On Mon, 2008-06-23 at 05:58 +0800, Rene Herman wrote:
> > On 22-06-08 20:25, Rene Herman wrote:
> >
> > > Another way to solve it _could_ be to just make up a device link if
> > > something generic is available so that sensors doesn't trip over it
> > > in the first place but I don't know if that's a good option. You
> > > might.
> >
> > This also works for me and, if correct, is ofcourse better than the
> > CONFIG option. Wants a comment from the thermal_zone side (for which
> > Zhang Rui seems the correct CC?) though.
> Hi, Rene and Hans,
>
> Thank you for your efforts on this issue and sorry for the late
> response, I did not check my email during the whole weekend.
>
> About the hwmon ABI, after the device symbol link is created, are there
> any other ABIs required in the device node?

Not that I know of. The code currently in the kernel works just fine
with both lm-sensors SVN trunk (which will become 2.10.7 soon) and
3.0.2 as far as I can see.

> If no, this patch seems to work, although it might break if the first
> registered ACPI thermal zone device is unregistered, which ONLY happens
> theoretically.

If I remember correctly, there are more than one device in a given
thermal zone, so having a link pointing to one of them would create an
asymmetry. This basically means that nobody can use the device link in
question, so there's no point in creating it from a functional point of
view. If the goal is simply to solve the breakage that was reported by
Rene, the bug is in libsensors, so that' where it must be fixed (and
actually it already is.)

Thanks,
--
Jean Delvare

2008-06-23 11:57:27

by Jean Delvare

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Hi Rene,

On Mon, 23 Jun 2008 12:24:06 +0200, Rene Herman wrote:
> On 23-06-08 12:08, Jean Delvare wrote:
> > No, the kernel does the right thing and does not need to be modified
> > at all.
>
> No Jean, this is totally unacceptable. No matter how you want to call
> things, 2.6.26 is going to break important functionality on millions of
> systems and you simply do not get to do that.

No, it's not going to be the end of the world that you predict. Please
stop being alarmist, it really doesn't help.

We are going to break hardware monitoring for users who upgrade to
kernel 2.6.26 by themselves and have enabled option "THERMAL" and are
using lm-sensors <= 2.10.6. I suspect this is a relatively small number
of users, and these are also the ones who are presumably skilled enough
to go to http://www.lm-sensors.org/, find the patch they need, and
apply it to libsensors themselves.

We are not going to break any system using a distribution kernel
because distributions test their kernel at least to some extent before
they release it, and that kind of breakage can't go unnoticed. So,
distributions which haven't completely switched to lm-sensors 3.x yet,
will see the breakage and patch their libsensors 2.10.6 to fix it. For
what it's worth, the patch in question is in openSuse since March 17th.

For distributions who have good maintainers, there should never be any
problem anyway. We maintain and publish a list of recommended patches.
A distribution with all these patches applied should avoid all known
problems, compatibility or otherwise.

Please also realize that I personally keep the maintainers of the
Fedora, openSuse and Debian lm-sensors packages informed when I update
the list of recommended patches. If I should forget to do so and they
hit a problem, they know where to find me.

> Can you comment on the last patch posted? It's trivial:
>
> http://lkml.org/lkml/2008/6/22/243

It's trivial and wrong, so thanks but no thanks. The bug is in
libsensors, we fix it in libsensors.

--
Jean Delvare

2008-06-23 12:34:35

by Rene Herman

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On 23-06-08 13:57, Jean Delvare wrote:

> On Mon, 23 Jun 2008 12:24:06 +0200, Rene Herman wrote:

>> No Jean, this is totally unacceptable. No matter how you want to call
>> things, 2.6.26 is going to break important functionality on millions of
>> systems and you simply do not get to do that.
>
> No, it's not going to be the end of the world that you predict. Please
> stop being alarmist, it really doesn't help.
>
> We are going to break hardware monitoring for users who upgrade to
> kernel 2.6.26 by themselves and have enabled option "THERMAL"

Which is an option selected by ACPI_THERMAL, for which I quoted the help
text earlier. Basically everyone with ACPI enabled will have it enabled.
Rather wide-spread, that is.

> and are using lm-sensors <= 2.10.6. I suspect this is a relatively
> small number of users

Yes, right. All not completely new systems, all completely new slackware
and derived systems... "relatively" is a word very much needed here.

I really cannot believe you guys are actually arguing this. It seems
that me being tired and short pulled this in to senseless country but
can we please concentrate on the issue?

libsensors dictated the ABI rule that the hwmon directories must have
device backlinks; the new ACPI Thermal Zone hwmon interface breaks that
bit of ABI. It is not relevant that that ABI may have gotten to be as a
result of unfortunate programming on the userspace side -- the only
thing relevant is that it IS. lm-sensors 2 is on millions of systems out
there. This is not meant agressively, or whatever you guys seem to want
to read in my words, it's un undeniable fact.

> and these are also the ones who are presumably skilled enough to go
> to http://www.lm-sensors.org/, find the patch they need, and apply it
> to libsensors themselves.

At times there can obviously be situations where it's fine to require
new userspace but in this case we have a new userspace which hasn't even
been released yet, we have a ton of _different_ userspace depending on
that bit of core userspace, we have breakage of the important kind (as
you no doubt know, sensors can be pretty vital, although admittedly it's
not silent breakage at least) and we have an opportunity to just say
"okay, we'll apply a 1 line patch and be done with it" which avoids any
and all problems. Why are you arguing this?

>> Can you comment on the last patch posted? It's trivial:
>>
>> http://lkml.org/lkml/2008/6/22/243
>
> It's trivial and wrong, so thanks but no thanks. The bug is in
> libsensors, we fix it in libsensors.

This cannot be the reason, because it's not wrong. We just need a device
backlink. Basically, any single one will do. It's just about keeping
lm-sensors 2 happy.

Rene.

2008-06-23 13:47:38

by Jean Delvare

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Hi Rene,

FYI: this is my last reply to you as far as this thread is concerned. I
have work to do, and this problem is already solved.

On Mon, 23 Jun 2008 14:35:03 +0200, Rene Herman wrote:
> On 23-06-08 13:57, Jean Delvare wrote:
>
> > On Mon, 23 Jun 2008 12:24:06 +0200, Rene Herman wrote:
>
> >> No Jean, this is totally unacceptable. No matter how you want to call
> >> things, 2.6.26 is going to break important functionality on millions of
> >> systems and you simply do not get to do that.
> >
> > No, it's not going to be the end of the world that you predict. Please
> > stop being alarmist, it really doesn't help.
> >
> > We are going to break hardware monitoring for users who upgrade to
> > kernel 2.6.26 by themselves and have enabled option "THERMAL"
>
> Which is an option selected by ACPI_THERMAL, for which I quoted the help
> text earlier. Basically everyone with ACPI enabled will have it enabled.
> Rather wide-spread, that is.

Correct, I had forgotten that it was enabled by ACPI_THERMAL. So indeed
pretty much everyone using lm-sensors will have it.

> > and are using lm-sensors <= 2.10.6. I suspect this is a relatively
> > small number of users
>
> Yes, right. All not completely new systems, all completely new slackware
> and derived systems... "relatively" is a word very much needed here.

Slackware 12.1 shipped with a 2.6.25 kernel and this isn't going to
change. So I am absolutely not worried about Slackware users in
general. The only users who will hit the breakage are the ones
upgrading their kernel themselves, and that's regardless of the
distribution.

> I really cannot believe you guys are actually arguing this. It seems
> that me being tired and short pulled this in to senseless country but
> can we please concentrate on the issue?

When you have the feeling that everybody else has gone crazy and you're
the only sane person on-board... you know what it really means? ;)

> libsensors dictated the ABI rule that the hwmon directories must have
> device backlinks; the new ACPI Thermal Zone hwmon interface breaks that
> bit of ABI. It is not relevant that that ABI may have gotten to be as a
> result of unfortunate programming on the userspace side -- the only
> thing relevant is that it IS. lm-sensors 2 is on millions of systems out
> there.

We don't care about how many systems use lm-sensors 2. We only care
about how many of these will upgrade to kernel 2.6.26 before they
upgrade to lm-sensors 2.10.7 or patch their lm-sensors 2.10.6. My take
is that these aren't that many people.

Seriously, the kind of "ABI breakage", as you insist on calling it,
happens all the time. Just looking at sensors and only counting the
very big changes, sensors were exposed in /proc in 2.4 kernels and are
now exposed in /sys in 2.6 kernels, and all hardware monitoring devices
were i2c devices to the kernel until kernel 2.6.13 and this is no
longer the case. Just try going a couple lm-sensors versions back while
still running your 2.6.25 kernel and you'll see it won't take long
before at least a specific case breaks.

And I'm fairly certain that we (the lm-sensors group) aren't specially
bad at that. Every other subsystem that evolves quickly must have the
same problems. That's exactly the reason why we have user-space
libraries interfacing with the kernel. When the kernel interfaces
evolve, we update the libraries to take the changes into account. It is
usually so smooth that you don't see it. This time it's a bit less
smooth because we've been too slow releasing 2.10.7. You can't get it
perfect all the time.

I'm really sorry that we don't live in an ideal world where everything
is compatible with everything forever.

> This is not meant agressively, or whatever you guys seem to want
> to read in my words, it's un undeniable fact.
>
> > and these are also the ones who are presumably skilled enough to go
> > to http://www.lm-sensors.org/, find the patch they need, and apply it
> > to libsensors themselves.
>
> At times there can obviously be situations where it's fine to require
> new userspace but in this case we have a new userspace which hasn't even
> been released yet, we have a ton of _different_ userspace depending on
> that bit of core userspace, we have breakage of the important kind (as
> you no doubt know, sensors can be pretty vital, although admittedly it's
> not silent breakage at least) and we have an opportunity to just say
> "okay, we'll apply a 1 line patch and be done with it" which avoids any
> and all problems. Why are you arguing this?

I'm arguing this because you're trying to frighten everyone with
"kernel ABI breakage" and "millions of users" when all we have is an
unfortunate bug in libsensors, which is already fixed and this fix is
only waiting to be released. Your 1 line patch, as small as it is, is
ugly and could have unexpected side effects.

> >> Can you comment on the last patch posted? It's trivial:
> >>
> >> http://lkml.org/lkml/2008/6/22/243
> >
> > It's trivial and wrong, so thanks but no thanks. The bug is in
> > libsensors, we fix it in libsensors.
>
> This cannot be the reason, because it's not wrong. We just need a device
> backlink. Basically, any single one will do. It's just about keeping
> lm-sensors 2 happy.

Your patch _is_ wrong. You make the device link point to _one_ of the
devices that belong to the thermal zone, arbitrarily. If the device is
removed before the thermal zone itself is, what happens? Breakage. If
we were to apply your patch just to smooth the transition with the
2.6.26 kernel, then we'd have to rework the code in question later
because it's not correct. And by your own terms, this link would become
part of the ABI, meaning that we could not get rid of it later or even
change it. Who knows? Anyone could have decided to use the link for
whatever purpose.

On top of that, libsensors 2 is going to do something with the device
link you created. I'm curious how you can guarantee that there won't be
any side effect, given that the devices in question were never meant to
be processed by libsensors.

You are looking at the problem you've hit and you are trying to solve
it, and this is great. But in the end you have to admit that you don't
really know the code you're touching. You did not read the whole
thermal zone code to understand how it was working. You did not read
the whole libsensors 2.10.x code to know for sure what will happen when
it follows the device link you want to add. I guess you also did not
read the whole libsensors 3.0.x code to make sure that adding this link
would have no unexpected side effect. So it's about time that you trust
the guys who wrote the code and are maintaining it for years when they
say that they have fixed the problem in the best possible way.

Thanks,
--
Jean Delvare

2008-06-23 14:05:48

by Rene Herman

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On 23-06-08 15:47, Jean Delvare wrote:

>> This cannot be the reason, because it's not wrong. We just need a device
>> backlink. Basically, any single one will do. It's just about keeping
>> lm-sensors 2 happy.
>
> Your patch _is_ wrong. You make the device link point to _one_ of the
> devices that belong to the thermal zone, arbitrarily. If the device is
> removed before the thermal zone itself is, what happens?

As Zhang Rui said, this cannot happen in reality. I'll stop talking to
you. Kernel side, it's not your problem anyway, it's an ACPI Thermal
Zone one. Guess I'll go ask the 2.6.26 release manager if he feels that
breaking existing lm-sensors 2 userspace systems is acceptable. I myself
obviously know how to fix things by now.

Admittedly I need to find another hobby because the regularity with
which people on this list piss me off is definitely disturbing. I also
have this depressing notion that it might just be people pissing me off
with frightening regularity period, but oh well.

Gardening... maybe I'll do gardening now.

Rene.

2008-06-23 14:08:53

by Hans de Goede

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Jean Delvare wrote:
> Please also realize that I personally keep the maintainers of the
> Fedora, openSuse and Debian lm-sensors packages informed when I update
> the list of recommended patches. If I should forget to do so and they
> hit a problem, they know where to find me.
>

To elaborate on that let me add that I am the Fedora lm_sensors maintainer and
as such that I'm very much aware of this problem.

Regards,


Hans

2008-06-23 14:31:30

by Matthew Garrett

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On Mon, Jun 23, 2008 at 02:35:03PM +0200, Rene Herman wrote:
> libsensors dictated the ABI rule that the hwmon directories must have
> device backlinks; the new ACPI Thermal Zone hwmon interface breaks that
> bit of ABI. It is not relevant that that ABI may have gotten to be as a
> result of unfortunate programming on the userspace side -- the only
> thing relevant is that it IS. lm-sensors 2 is on millions of systems out
> there. This is not meant agressively, or whatever you guys seem to want
> to read in my words, it's un undeniable fact.

No, libsensors made an assumption about the ABI that turns out not to be
true. The ABI hasn't changed, libsensors is just being exposed to a case
it didn't previously see.

We've had this kind of change before. The ACPI backlight code changed in
such a way that scripts that blindly wrote values instead of (correctly)
reading the maximum brightness value broke. mmap's behaviour changed in
such a way that it was no longer possible for vm86 to execute code that
wasn't mapped as executable, breaking libx86. The applications in
question were undeniably buggy. Those are examples that I was personally
involved with - I'm sure there are others. Where userspace has made
false assumptions, it's not the kernel's responsibility to continue to
support those assumptions.

--
Matthew Garrett | [email protected]

2008-06-23 17:09:39

by Rene Herman

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

On 23-06-08 16:31, Matthew Garrett wrote:

> On Mon, Jun 23, 2008 at 02:35:03PM +0200, Rene Herman wrote:
>> libsensors dictated the ABI rule that the hwmon directories must have
>> device backlinks; the new ACPI Thermal Zone hwmon interface breaks that
>> bit of ABI. It is not relevant that that ABI may have gotten to be as a
>> result of unfortunate programming on the userspace side -- the only
>> thing relevant is that it IS. lm-sensors 2 is on millions of systems out
>> there. This is not meant agressively, or whatever you guys seem to want
>> to read in my words, it's un undeniable fact.
>
> No, libsensors made an assumption about the ABI that turns out not to be
> true. The ABI hasn't changed, libsensors is just being exposed to a case
> it didn't previously see.
>
> We've had this kind of change before. The ACPI backlight code changed in
> such a way that scripts that blindly wrote values instead of (correctly)
> reading the maximum brightness value broke. mmap's behaviour changed in
> such a way that it was no longer possible for vm86 to execute code that
> wasn't mapped as executable, breaking libx86. The applications in
> question were undeniably buggy. Those are examples that I was personally
> involved with - I'm sure there are others. Where userspace has made
> false assumptions, it's not the kernel's responsibility to continue to
> support those assumptions.

We are not going to agree. In this, it's not a random application, but the
one and only interface to sensors that's in use that breaks. It is all of
sensors support that breaks, all user interfaces, as they all depend on the
one libsensors. Sure, if some random application makes bad assumptions the
remedy is fixing the random application. If the one and only interface to
something breaks, it's the ABI that breaks.

And if people really insist on calling it FNOOZLEGLUM breakage instead of
ABI breakage, all for it. I love exciting words. Its just that I'm really
more interested in the "breakage" bit than anyone else in this thread it
seems.

Rene.

2008-06-23 17:55:38

by Len Brown

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Rene,
Thank you for reporting this.

I agree that this failure is an unwelcome surprise to those users
who upgrade to 2.6.26 but are still using libsensors <= 2.10.6.

Jean, Mark, Hans,

I'm actually fine with adding a temporary kernel config option
along the lines Rene suggested to ease the migration
to linux-2.6.26 for those users.

But the config option would need to be scheduled for removal
after a certain period (say 6 months) so we don't have to maintain
it forever.

More importantly, I think it would also have to be disabled by default
so that it would not have a negative impact on what we think are the
majority of properly configured systems. After all, we fixed this
bug in user-space about out 4 months ago and as you point out, the
distro upgrade path is actually quite well looked after.

So I'm not sure how useful it would be to the target users.
After they run into the problem, they'd probably google it
and find that they can either tweak a kernel config option
or upgrade libsensors. And we'd prefer that they do the
later rather than the former, yes?

just let me know.

thanks,
-Len

2008-06-23 19:55:05

by Jean Delvare

[permalink] [raw]
Subject: Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

Hi Len,

On Mon, 23 Jun 2008 13:54:56 -0400 (EDT), Len Brown wrote:
> Rene,
> Thank you for reporting this.
>
> I agree that this failure is an unwelcome surprise to those users
> who upgrade to 2.6.26 but are still using libsensors <= 2.10.6.
>
> Jean, Mark, Hans,
>
> I'm actually fine with adding a temporary kernel config option
> along the lines Rene suggested to ease the migration
> to linux-2.6.26 for those users.
>
> But the config option would need to be scheduled for removal
> after a certain period (say 6 months) so we don't have to maintain
> it forever.
>
> More importantly, I think it would also have to be disabled by default
> so that it would not have a negative impact on what we think are the
> majority of properly configured systems. After all, we fixed this
> bug in user-space about out 4 months ago and as you point out, the
> distro upgrade path is actually quite well looked after.
>
> So I'm not sure how useful it would be to the target users.
> After they run into the problem, they'd probably google it
> and find that they can either tweak a kernel config option
> or upgrade libsensors. And we'd prefer that they do the
> later rather than the former, yes?

If the option defaults to thermal zone being enabled, and users have to
rebuild their kernel to disable it, then I don't think it has any
value. Patching and rebuilding libsensors is no harder than
reconfiguring and rebuilding the kernel, so we can as well tell the
users to do the former.

If the option defaults to thermal zone being disabled, then it makes
some sense as a way to smooth the transition. If the help text is clear
enough (clearly saying for which versions of lm-sensors users can
enable the option and for which versions they would rather not) it
should work. The drawback being that users in a hurry might stick to
the default without reading the help text, and miss an opportunity to
have the ACPI thermal zones magically integrated into their favorite
monitoring application.

Whether avoiding the risk of easily fixable breakage for some users is
worth the temporary loss of new functionality for other users, I can't
really say. I think I wouldn't do it myself, but I'm not the only one
to decide. If we decide to do it, I have no objection.

Thanks,
--
Jean Delvare

2008-06-23 20:07:18

by Rene Herman

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

>From 8023fee88cdbbc755a9ae692803e6292a619d56c Mon Sep 17 00:00:00 2001
From: Rene Herman <[email protected]>
Date: Mon, 23 Jun 2008 21:50:25 +0200
Subject: [PATCH] acpi/thermal: allow disabling of thermal zone hwmon support

2.6.26-rc gained a hwmon interface to the Thermal Zone driver
which unfortunately breaks lm-sensors 2 userspace and renders
all other (subsequent) hwmon sensors inoperable also.

Many systems, current slackware and derivative systems among
them, are using lm-sensors 2 and would be affected. A new
lm-sensors 2.10.7 release is needed to fix this from the user
side.

This makes the hwmon support optional (enabled by default) and
comments that it needs a new enough lm-sensors userspace. This
also immediately schedules the option for removal again as this
is only intended to not break systems over the 2.6.26 upgrade
when new enough lm-sensors is not yet available or widely
deployed.

Signed-off-by: Rene Herman <[email protected]>
CC: Len Brown <[email protected]>
CC: Hans de Goede <[email protected]>
CC: Zhang Rui <[email protected]>
CC: Mark M. Hoffman <[email protected]>
CC: Jean Delvare <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]
---
Documentation/feature-removal-schedule.txt | 9 +++++++++
drivers/thermal/Kconfig | 10 ++++++++++
drivers/thermal/thermal_sys.c | 4 ++--
include/linux/thermal.h | 6 ++----
4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 5b3f31f..9e5f86c 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -312,3 +312,12 @@ When: 2.6.26
Why: Implementation became generic; users should now include
linux/semaphore.h instead.
Who: Matthew Wilcox <[email protected]>
+
+---------------------------
+
+What: CONFIG_THERMAL_HWMON
+When: Januaru 2009
+Why: This option was introduced just to allow older lm-sensors userspace
+ to keep working over the upgrade to 2.6.26. At the scheduled time of
+ removal fixed lm-sensors (2.0 or 3.0) should be readily available.
+Who: Rene Herman <[email protected]>
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 4b62852..0b359ab 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -12,3 +12,13 @@ menuconfig THERMAL
cooling devices.
All platforms with ACPI thermal support can use this driver.
If you want this support, you should say Y or M here.
+
+config THERMAL_HWMON
+ bool "Hardware monitoring support"
+ depends on HWMON=y || HWMON=THERMAL
+ default y
+ help
+ The generic thermal sysfs driver's hardware monitoring support
+ requires a 2.10.7/3.0.2 or later lm-sensors userspace.
+
+ Say Y unless you need support for older lm-sensors.
diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 6098787..fe07462 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -295,8 +295,8 @@ thermal_cooling_device_trip_point_show(struct device *dev,

/* Device management */

-#if defined(CONFIG_HWMON) || \
- (defined(CONFIG_HWMON_MODULE) && defined(CONFIG_THERMAL_MODULE))
+#if defined(CONFIG_THERMAL_HWMON)
+
/* hwmon sys I/F */
#include <linux/hwmon.h>
static LIST_HEAD(thermal_hwmon_list);
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 06d3e6e..917707e 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -66,8 +66,7 @@ struct thermal_cooling_device {
((long)t-2732+5)/10 : ((long)t-2732-5)/10)
#define CELSIUS_TO_KELVIN(t) ((t)*10+2732)

-#if defined(CONFIG_HWMON) || \
- (defined(CONFIG_HWMON_MODULE) && defined(CONFIG_THERMAL_MODULE))
+#if defined(CONFIG_THERMAL_HWMON)
/* thermal zone devices with the same type share one hwmon device */
struct thermal_hwmon_device {
char type[THERMAL_NAME_LENGTH];
@@ -94,8 +93,7 @@ struct thermal_zone_device {
struct idr idr;
struct mutex lock; /* protect cooling devices list */
struct list_head node;
-#if defined(CONFIG_HWMON) || \
- (defined(CONFIG_HWMON_MODULE) && defined(CONFIG_THERMAL_MODULE))
+#if defined(CONFIG_THERMAL_HWMON)
struct list_head hwmon_node;
struct thermal_hwmon_device *hwmon;
struct thermal_hwmon_attr temp_input; /* hwmon sys attr */
--
1.5.5


Attachments:
0001-acpi-thermal-allow-disabling-of-thermal-zone-hwmon.patch (4.31 kB)

2008-06-23 20:24:25

by Rene Herman

[permalink] [raw]
Subject: Re: [lm-sensors] [REGRESSION, ABI] Re: LMSENSORS: 2.6.26-rc, enabling ACPI Termal Zone support costs sensors

>From 7b41d17346cdd1f9db1223c024bd9f4604ee5c82 Mon Sep 17 00:00:00 2001
From: Rene Herman <[email protected]>
Date: Mon, 23 Jun 2008 21:50:25 +0200
Subject: [PATCH] acpi/thermal: allow disabling of thermal zone hwmon support

2.6.26-rc gained a hwmon interface to the Thermal Zone driver
which unfortunately breaks lm-sensors 2 userspace and renders
all other (subsequent) hwmon sensors inoperable also.

Many systems, current slackware and derivative systems among
them, are using lm-sensors 2 and would be affected. A new
lm-sensors 2.10.7 release is needed to fix this from the user
side.

This makes the hwmon support optional (enabled by default) and
comments that it needs a new enough lm-sensors userspace. This
also immediately schedules the option for removal again as this
is only intended to not break systems over the 2.6.26 upgrade
when new enough lm-sensors is not yet available or widely
deployed.

Signed-off-by: Rene Herman <[email protected]>
CC: Len Brown <[email protected]>
CC: Hans de Goede <[email protected]>
CC: Zhang Rui <[email protected]>
CC: Mark M. Hoffman <[email protected]>
CC: Jean Delvare <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]
---
Documentation/feature-removal-schedule.txt | 9 +++++++++
drivers/thermal/Kconfig | 10 ++++++++++
drivers/thermal/thermal_sys.c | 4 ++--
include/linux/thermal.h | 6 ++----
4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 5b3f31f..46ece3f 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -312,3 +312,12 @@ When: 2.6.26
Why: Implementation became generic; users should now include
linux/semaphore.h instead.
Who: Matthew Wilcox <[email protected]>
+
+---------------------------
+
+What: CONFIG_THERMAL_HWMON
+When: January 2009
+Why: This option was introduced just to allow older lm-sensors userspace
+ to keep working over the upgrade to 2.6.26. At the scheduled time of
+ removal fixed lm-sensors (2.x or 3.x) should be readily available.
+Who: Rene Herman <[email protected]>
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 4b62852..0b359ab 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -12,3 +12,13 @@ menuconfig THERMAL
cooling devices.
All platforms with ACPI thermal support can use this driver.
If you want this support, you should say Y or M here.
+
+config THERMAL_HWMON
+ bool "Hardware monitoring support"
+ depends on HWMON=y || HWMON=THERMAL
+ default y
+ help
+ The generic thermal sysfs driver's hardware monitoring support
+ requires a 2.10.7/3.0.2 or later lm-sensors userspace.
+
+ Say Y unless you need support for older lm-sensors.
diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 6098787..fe07462 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -295,8 +295,8 @@ thermal_cooling_device_trip_point_show(struct device *dev,

/* Device management */

-#if defined(CONFIG_HWMON) || \
- (defined(CONFIG_HWMON_MODULE) && defined(CONFIG_THERMAL_MODULE))
+#if defined(CONFIG_THERMAL_HWMON)
+
/* hwmon sys I/F */
#include <linux/hwmon.h>
static LIST_HEAD(thermal_hwmon_list);
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 06d3e6e..917707e 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -66,8 +66,7 @@ struct thermal_cooling_device {
((long)t-2732+5)/10 : ((long)t-2732-5)/10)
#define CELSIUS_TO_KELVIN(t) ((t)*10+2732)

-#if defined(CONFIG_HWMON) || \
- (defined(CONFIG_HWMON_MODULE) && defined(CONFIG_THERMAL_MODULE))
+#if defined(CONFIG_THERMAL_HWMON)
/* thermal zone devices with the same type share one hwmon device */
struct thermal_hwmon_device {
char type[THERMAL_NAME_LENGTH];
@@ -94,8 +93,7 @@ struct thermal_zone_device {
struct idr idr;
struct mutex lock; /* protect cooling devices list */
struct list_head node;
-#if defined(CONFIG_HWMON) || \
- (defined(CONFIG_HWMON_MODULE) && defined(CONFIG_THERMAL_MODULE))
+#if defined(CONFIG_THERMAL_HWMON)
struct list_head hwmon_node;
struct thermal_hwmon_device *hwmon;
struct thermal_hwmon_attr temp_input; /* hwmon sys attr */
--
1.5.5


Attachments:
0001-acpi-thermal-allow-disabling-of-thermal-zone-hwmon.patch (4.31 kB)