2005-10-25 13:30:14

by Jack Howarth

[permalink] [raw]
Subject: W2100Z and acpi

Have there been any reports of issues with acpi on the Sun
dual Opteron W2100Z for any of the 2.6.12 or 2.6.13 kernels. We
were suffering a slew of random shutdowns yesterday on our machine
running Fedora Core 4 and their 2.6.12-1.1456_FC4smp kernel build.
The machine which had acpi enabled in chkconfig was randoming shutting
down with errors of...

Oct 22 05:01:23 XXXX kernel: Critical temperature reached (68 C), shutting down.
Oct 22 05:01:23 XXXX kernel: Critical temperature reached (68 C), shutting down.

...in the system log. We have switched to their latest 2.6.13-1.1532_FC4smp
kernel and the random thermal shutdowns have ceased. I still plan to update
the BIOS using the Supplemental 2.1 cd from Sun today, but was wondering
if this was a known problem. I am assuming that the errora above have to
be from acpi. Also, looking in /proc/acpi/thermal_zone/THRS, I see...

cat /proc/acpi/thermal_zone/THRS/trip_points
critical (S5): 65 C

and

cat /proc/acpi/thermal_zone/THRS/temperature
temperature: 43 C

under the latest kernel with both CPUs running at 100% during numerical
calculations. I am unclear how exactly the temperature value is determined.
I tried installing lm_sensors but unfortunately the required driver for
the fan/temp control chipset doesn't exist yet for the W2100Z. Thanks in
advance for any advice on debugging this further so we don't run into
it again.
Jack