2006-08-08 21:40:29

by Steven Rostedt

Subject: swsusp and suspend2 like to overheat my laptop


A few months ago, I installed suspend2 on my laptop. It worked great for
a few days, when suddenly my laptop started to get very hot and the fan
constantly went off, and then I started getting these:

---
Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
localhost kernel: CPU0: Temperature above threshold

Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
localhost kernel: CPU1: Temperature above threshold


Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
localhost kernel: CPU0: Running in modulated clock mode

Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
localhost kernel: CPU1: Running in modulated clock mode
---

I even posted once since I thought I found the problem, but I was wrong.
So I decided to remove Suspend2 and go back to the normal kernel.

Recently, I've decided to try out swsusp. Well, it has been working fine
for almost a week now. But unfortunately, I just started to have my fan
go off constantly, and I'm getting the above messages again (hence why
the date on the messages is today). Checking out the temp, it's going into
the high 70C. That's not too bad, but it only happens when suspending
every night instead of shutting down.

This is a Thinkpad G41, with a P4HT and this is an unmodified 2.6.18-rc2
kernel. I guess I'll have to start shutting down again, and only suspend
every so often. But just thought I'd let the people of knowledge know.

-- Steve


2006-08-08 21:54:43

by Nigel Cunningham

Subject: Re: swsusp and suspend2 like to overheat my laptop

Hi Steven.

On Wednesday 09 August 2006 07:40, Steven Rostedt wrote:
> A few months ago, I installed suspend2 on my laptop. It worked great for
> a few days, when suddenly my laptop started to get very hot and the fan
> constantly went off, and then I started getting these:
>
> ---
> Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> localhost kernel: CPU0: Temperature above threshold
>
> Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> localhost kernel: CPU1: Temperature above threshold
>
>
> Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> localhost kernel: CPU0: Running in modulated clock mode
>
> Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> localhost kernel: CPU1: Running in modulated clock mode
> ---
>
> I even posted once since I thought I found the problem, but I was wrong.
> So I decided to remove Suspend2 and go back to the normal kernel.
>
> Recently, I've decided to try out swsusp. Well, it has been working fine
> for almost a week now. But unfortunately, I just started to have my fan
> go off constantly, and I'm getting the above messages again (hence why
> the date on the messages is today). Checking out the temp, it's going into
> the high 70C. That's not too bad, but it only happens when suspending
> every night instead of shutting down.
>
> This is a Thinkpad G41, with a P4HT and this is an unmodified 2.6.18-rc2
> kernel. I guess I'll have to start shutting down again, and only suspend
> every so often. But just thought I'd let the people of knowledge know.

The problem will be ACPI related, not particular to swsusp or Suspend2, which
is why you're seeing it with both implementations. I would suggest that you
contact the ACPI guys, and also look to see whether there is a bios update
available and/or a DSDT override for your machine. The latter will help if the
problem is with your particular machine's ACPI support, the former if it's a
more general ACPI issue.

Regards,

Nigel
--
See http://www.suspend2.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.



2006-08-08 23:31:50

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop


On Wed, 9 Aug 2006, Nigel Cunningham wrote:

>
> The problem will be ACPI related, not particular to swsusp or Suspend2, which
> is why you're seeing it with both implementations. I would suggest that you
> contact the ACPI guys, and also look to see whether there is a bios update
> available and/or a DSDT override for your machine. The latter will help if the
> problem is with your particular machine's ACPI support, the former if it's a
> more general ACPI issue.
>

Thanks for the response Nigel,

There does exist a recent bios update for this machine:

http://www-307.ibm.com/pc/support/site.wss/document.do?sitestyle=lenovo&lndocid=MIGR-58127

Hmm, it requires windows, and I've already wiped out that partition. I
did a search but it seems really scary to update the BIOS via Linux.

Anyone else out there have a Thinkpad G41 and has successfully upgraded
their BIOS?

Thanks,

-- Steve

2006-08-08 23:35:56

by Lee Revell

Subject: Re: swsusp and suspend2 like to overheat my laptop

On Tue, 2006-08-08 at 19:31 -0400, Steven Rostedt wrote:
> On Wed, 9 Aug 2006, Nigel Cunningham wrote:
>
> >
> > The problem will be ACPI related, not particular to swsusp or Suspend2, which
> > is why you're seeing it with both implementations. I would suggest that you
> > contact the ACPI guys, and also look to see whether there is a bios update
> > available and/or a DSDT override for your machine. The latter will help if the
> > problem is with your particular machine's ACPI support, the former if it's a
> > more general ACPI issue.
> >
>
> Thanks for the response Nigel,
>
> There does exist a recent bios update for this machine:
>
> http://www-307.ibm.com/pc/support/site.wss/document.do?sitestyle=lenovo&lndocid=MIGR-58127
>
> Hmm, it requires windows, and I've already wiped out that partition. I
> did a search but it seems really scary to update the BIOS via Linux.
>
> Anyone else out there have a Thinkpad G41 and has successfully upgraded
> their BIOS?

I would just report it to the ACPI people. It's a bug if Linux does not
work with the same BIOS + DSDT that the other OS works on.

Lee

2006-08-08 23:41:58

by Nigel Cunningham

Subject: Re: swsusp and suspend2 like to overheat my laptop

Hi.

On Wednesday 09 August 2006 09:35, Lee Revell wrote:
> On Tue, 2006-08-08 at 19:31 -0400, Steven Rostedt wrote:
> > On Wed, 9 Aug 2006, Nigel Cunningham wrote:
> > > The problem will be ACPI related, not particular to swsusp or Suspend2,
> > > which is why you're seeing it with both implementations. I would
> > > suggest that you contact the ACPI guys, and also look to see whether
> > > there is a bios update available and/or a DSDT override for your
> > > machine. The latter will help if the problem is with your particular
> > > machine's ACPI support, the former if it's a more general ACPI issue.
> >
> > Thanks for the response Nigel,
> >
> > There does exist a recent bios update for this machine:
> >
> > > http://www-307.ibm.com/pc/support/site.wss/document.do?sitestyle=lenovo&lndocid=MIGR-58127
> >
> > Hmm, it requires windows, and I've already wiped out that partition. I
> > did a search but it seems really scary to update the BIOS via Linux.
> >
> > Anyone else out there have a Thinkpad G41 and has successfully upgraded
> > their BIOS?
>
> I would just report it to the ACPI people. It's a bug if Linux does not
> work with the same BIOS + DSDT that the other OS works on.

True. I was assuming (perhaps wrongly?) that Steven is interested in both
getting the bug fixed and being able to hibernate while he waits for the ACPI
guys to achieve bug-for-bug compatibility with M$; hence suggesting doing
both.

Regards,

Nigel
--
See http://www.suspend2.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.



2006-08-08 23:50:25

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop


On Wed, 9 Aug 2006, Nigel Cunningham wrote:

> Hi.
>
> On Wednesday 09 August 2006 09:35, Lee Revell wrote:
> > On Tue, 2006-08-08 at 19:31 -0400, Steven Rostedt wrote:
> > > On Wed, 9 Aug 2006, Nigel Cunningham wrote:
> > > > The problem will be ACPI related, not particular to swsusp or Suspend2,
> > > > which is why you're seeing it with both implementations. I would
> > > > suggest that you contact the ACPI guys, and also look to see whether
> > > > there is a bios update available and/or a DSDT override for your
> > > > machine. The latter will help if the problem is with your particular
> > > > machine's ACPI support, the former if it's a more general ACPI issue.
> > >
> > > Thanks for the response Nigel,
> > >
> > > There does exist a recent bios update for this machine:
> > >
> > > > http://www-307.ibm.com/pc/support/site.wss/document.do?sitestyle=lenovo&lndocid=MIGR-58127
> > >
> > > Hmm, it requires windows, and I've already wiped out that partition. I
> > > did a search but it seems really scary to update the BIOS via Linux.
> > >
> > > Anyone else out there have a Thinkpad G41 and has successfully upgraded
> > > their BIOS?
> >
> > I would just report it to the ACPI people. It's a bug if Linux does not
> > work with the same BIOS + DSDT that the other OS works on.
>
> True. I was assuming (perhaps wrongly?) that Steven is interested in both
> getting the bug fixed and being able to hibernate while he waits for the ACPI
> guys to achieve bug-for-bug compatibility with M$; hence suggesting doing
> both.
>

I would prefer to do both, but I really can't tell you if the $OTHER_OS
works or not. I booted it once with this machine, and that was only to
register it with IBM. ;)

After that, I slapped in my Debian install CD and the rest is history.


-- Steve

2006-08-08 23:54:12

by Pavel Machek

Subject: Re: swsusp and suspend2 like to overheat my laptop

Hi!

> A few months ago, I installed suspend2 on my laptop. It worked great for
> a few days, when suddenly my laptop started to get very hot and the fan
> constantly went off, and then I started getting these:

I take it as "if I keep it for a week powered off, it will not do
this".

> ---
> Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> localhost kernel: CPU0: Temperature above threshold
>
> Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> localhost kernel: CPU1: Temperature above threshold
>
>
> Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> localhost kernel: CPU0: Running in modulated clock mode
>
> Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> localhost kernel: CPU1: Running in modulated clock mode
> ---

P4 has thermal protection, so you are actually safe.

Nigel is right, this is acpi problem, but I guess we can help it. Do
you have /proc/acpi/fan? Do you have /proc/acpi/ibm/fan? Can you try
playing with them?

And yes, this should go into bugzilla.kernel.org.

> Recently, I've decided to try out swsusp. Well, it has been working fine
> for almost a week now. But unfortunately, I just started to have my fan
> go off constantly, and I'm getting the above messages again (hence why
> the date on the messages is today). Checking out the temp, it's going into
> the high 70C. That's not too bad, but it only happens when suspending
> every night instead of shutting down.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-09 02:23:41

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop


On Wed, 9 Aug 2006, Pavel Machek wrote:

> Hi!
>
> > A few months ago, I installed suspend2 on my laptop. It worked great for
> > a few days, when suddenly my laptop started to get very hot and the fan
> > constantly went off, and then I started getting these:
>
> I take it as "if I keep it for a week powered off, it will not do
> this".

Not quite. It's more of, "if I suspend every night instead of leaving it
running or shutting it down, it will do this" or "if I power off at night
or just leave it running, it will not do this".


>
> > ---
> > Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> > localhost kernel: CPU0: Temperature above threshold
> >
> > Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> > localhost kernel: CPU1: Temperature above threshold
> >
> >
> > Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> > localhost kernel: CPU0: Running in modulated clock mode
> >
> > Message from syslogd@localhost at Tue Aug 8 16:08:53 2006 ...
> > localhost kernel: CPU1: Running in modulated clock mode
> > ---
>
> P4 has thermal protection, so you are actually safe.

Yeah, but still, the keyboard gets pretty hot too, and I'm actually more
worried about damaging something that is close by than damaging the CPU
itself.

>
> Nigel is right, this is acpi problem, but I guess we can help it. Do
> you have /proc/acpi/fan? Do you have /proc/acpi/ibm/fan? Can you try
> playing with them?

hmm, I have /proc/acpi/fan/ but nothing is in it. Here's some configs of
interest:

CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
# CONFIG_ACPI_SLEEP_PROC_SLEEP is not set
CONFIG_ACPI_AC=m
CONFIG_ACPI_BATTERY=m
CONFIG_ACPI_BUTTON=m
CONFIG_ACPI_VIDEO=m
# CONFIG_ACPI_HOTKEY is not set
CONFIG_ACPI_FAN=m
# CONFIG_ACPI_DOCK is not set
CONFIG_ACPI_PROCESSOR=m
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=m
# CONFIG_ACPI_ASUS is not set
CONFIG_ACPI_IBM=m
# CONFIG_ACPI_IBM_DOCK is not set
# CONFIG_ACPI_TOSHIBA is not set
CONFIG_ACPI_BLACKLIST_YEAR=2001
CONFIG_ACPI_DEBUG=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=m
# CONFIG_ACPI_SBS is not set


$ lsmod |grep fan
fan 6916 0

$ sudo modprobe ibm_acpi
$ ls /proc/acpi/ibm/
bay bluetooth driver led thermal
beep cmos hotkey light video

No fan there

and still no fan stuff in /proc/acpi/fan.


>
> And yes, this should go into bugzilla.kernel.org.

OK, will do.


Thanks,

-- Steve

2006-08-09 06:15:16

by Ian Campbell

Subject: Re: swsusp and suspend2 like to overheat my laptop

On Tue, 2006-08-08 at 19:31 -0400, Steven Rostedt wrote:
> Hmm, it requires windows, and I've already wiped out that partition.
> I did a search but it seems really scary to update the BIOS via Linux.

I've done it in the past by using memdisk[0] to boot a freedos floppy
image[1] which I have added the BIOS update to... It worked but it was a
bit nerve wracking ;-)

Ian.

[0] http://syslinux.zytor.com/memdisk.php
[1] http://odin.fdos.org/odin2005/ (FWIW I used odin2880.img)

--
Ian Campbell

BOFH excuse #382:

Someone was smoking in the computer room and set off the halon systems.

2006-08-09 07:40:19

by Pavel Machek

Subject: Re: swsusp and suspend2 like to overheat my laptop

Hi!

> > > A few months ago, I installed suspend2 on my laptop. It worked great for
> > > a few days, when suddenly my laptop started to get very hot and the fan
> > > constantly went off, and then I started getting these:
> >
> > I take it as "if I keep it for a week powered off, it will not do
> > this".
>
> Not quite. It's more of, "if I suspend every night instead of leaving it
> running or shutting it down, it will do this" or "if I power off at night
> or just leave it running, it will not do this".

Okay, can you try to leave it up for a week or two (no suspends, no
poweroffs) and see what happens?

> > P4 has thermal protection, so you are actually safe.
>
> Yeah, but still, the keyboard gets pretty hot too, and I'm actually more
> worried about damaging something that is close by than damaging the CPU
> itself.

If you damage something, machine was misdesigned in the first place.

cat we get contents of /proc/acpi/thermal*/*/* ?

> $ sudo modprobe ibm_acpi
> $ ls /proc/acpi/ibm/
> bay bluetooth driver led thermal
> beep cmos hotkey light video
>
> No fan there

Does ibm/thermal work?

Seems like fan is completely controlled by hardware. What may still
help: either saving or avoiding saving reserved parts of memory. But
this is all magic.

How s2ram works would be useful info.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-09 11:45:49

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop


On Wed, 9 Aug 2006, Pavel Machek wrote:

> Hi!
>
> > > > A few months ago, I installed suspend2 on my laptop. It worked great for
> > > > a few days, when suddenly my laptop started to get very hot and the fan
> > > > constantly went off, and then I started getting these:
> > >
> > > I take it as "if I keep it for a week powered off, it will not do
> > > this".
> >
> > Not quite. It's more of, "if I suspend every night instead of leaving it
> > running or shutting it down, it will do this" or "if I power off at night
> > or just leave it running, it will not do this".
>
> Okay, can you try to leave it up for a week or two (no suspends, no
> poweroffs) and see what happens?

I've had this laptop running for a couple of months without shutting down
and it doesn't have a problem. The only time that I do shut it down is
when I'm working on site (which I'm doing now), so I only shut down the
laptop while traveling. When I came across suspend2 (and later swsusp), I
was excited that I didn't need to restart all my applications when leaving
the place of work and coming back. But after being on site for several
days and using suspend to disk, I get a hot CPU. When I've been on site
and shut down normally each time I leave, I don't have a problem.

>
> > > P4 has thermal protection, so you are actually safe.
> >
> > Yeah, but still, the keyboard gets pretty hot too, and I'm actually more
> > worried about damaging something that is close by than damaging the CPU
> > itself.
>
> If you damage something, machine was misdesigned in the first place.

agreed, but you never know ;) This laptop is currently my lifeline :)

>
> cat we get contents of /proc/acpi/thermal*/*/* ?

I'm running after a poweroff (left it running over night in the hotel, and
I'm still in the hotel).

$ grep . /proc/acpi/thermal_zone/THRM/*
/proc/acpi/thermal_zone/THRM/cooling_mode:<setting not supported>
/proc/acpi/thermal_zone/THRM/cooling_mode:cooling mode: passive
/proc/acpi/thermal_zone/THRM/polling_frequency:<polling disabled>
/proc/acpi/thermal_zone/THRM/state:state: ok
/proc/acpi/thermal_zone/THRM/temperature:temperature: 48 C
/proc/acpi/thermal_zone/THRM/trip_points:critical (S5): 88 C
/proc/acpi/thermal_zone/THRM/trip_points:passive: 81 C: tc1=4 tc2=3 tsp=100 devices=0xcf6c2338

Note thermal_zone/THRM was finished with bash tab completion so they are
the only things that match the above glob expr.

>
> > $ sudo modprobe ibm_acpi
> > $ ls /proc/acpi/ibm/
> > bay bluetooth driver led thermal
> > beep cmos hotkey light video
> >
> > No fan there
>
> Does ibm/thermal work?

$ cat /proc/acpi/ibm/thermal
temperatures: not supported

I guess not.

>
> Seems like fan is completely controlled by hardware. What may still
> help: either saving or avoiding saving reserved parts of memory. But
> this is all magic.
>
> How s2ram works would be useful info.

No idea.

It does look like something isn't setting up the ACPI power properly on
resume, and that the CPU is probably in a busy loop while the machine is
idle. Just a guess.

Thanks for the support,

-- Steve

2006-08-09 11:53:59

by Nigel Cunningham

Subject: Re: swsusp and suspend2 like to overheat my laptop

Hi Steven.

Have you tried building the ACPI drivers as modules (if you're not already
doing so), and unloading them while suspending? If not, I'd give that a go.

Regards,

Nigel
--
See http://www.suspend2.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.



2006-08-09 11:59:04

by Pavel Machek

Subject: Re: swsusp and suspend2 like to overheat my laptop

Hi!

> > Okay, can you try to leave it up for a week or two (no suspends, no
> > poweroffs) and see what happens?
>
> I've had this laptop running for a couple of months without shutting down
> and it doesn't have a problem. The only time that I do shut it down

Ok.

> > > > P4 has thermal protection, so you are actually safe.
> > >
> > > Yeah, but still, the keyboard gets pretty hot too, and I'm actually more
> > > worried about damaging something that is close by than damaging the CPU
> > > itself.
> >
> > If you damage something, machine was misdesigned in the first place.
>
> agreed, but you never know ;) This laptop is currently my lifeline :)

You'd have good reason to get new one.

> > cat we get contents of /proc/acpi/thermal*/*/* ?
>
> I'm running after a poweroff (left it running over night in the hotel, and
> I'm still in the hotel).
>
> $ grep . /proc/acpi/thermal_zone/THRM/*
> /proc/acpi/thermal_zone/THRM/cooling_mode:<setting not supported>
> /proc/acpi/thermal_zone/THRM/cooling_mode:cooling mode: passive
> /proc/acpi/thermal_zone/THRM/polling_frequency:<polling disabled>
> /proc/acpi/thermal_zone/THRM/state:state: ok
> /proc/acpi/thermal_zone/THRM/temperature:temperature: 48 C
> /proc/acpi/thermal_zone/THRM/trip_points:critical (S5): 88 C
> /proc/acpi/thermal_zone/THRM/trip_points:passive: 81 C: tc1=4 tc2=3 tsp=100 devices=0xcf6c2338
>
> Note thermal_zone/THRM was finished with bash tab completion so they are
> the only things that match the above glob expr.

Ok, so it is the BIOS doing temperature control up to 81 C. At 81 C,
Linux should start cooling it, and at 88 C, Linux should shut down. At a
slightly higher temperature, the hardware should do an emergency shutdown.
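
As an illustration of the thresholds described above, here is a minimal Python sketch. It is not part of the original mail: the function names parse_trip_points and classify are hypothetical, and the input is the trip_points text quoted in this thread rather than a live read of /proc.

```python
import re

# Sample text copied from the trip_points output quoted in this thread.
TRIP_POINTS = """\
critical (S5): 88 C
passive: 81 C: tc1=4 tc2=3 tsp=100 devices=0xcf6c2338
"""


def parse_trip_points(text):
    """Return a dict such as {'critical': 88, 'passive': 81} (degrees C)."""
    points = {}
    for line in text.splitlines():
        # Each line starts with the trip point name, followed by "<n> C".
        match = re.match(r"\s*(\w+)[^:]*:\s*(\d+)\s*C", line)
        if match:
            points[match.group(1)] = int(match.group(2))
    return points


def classify(temp_c, points):
    """Name who is expected to act at temp_c, following the description above."""
    if temp_c >= points["critical"]:
        return "critical: OS shutdown (S5)"
    if temp_c >= points["passive"]:
        return "passive: OS starts cooling"
    return "below passive: left to BIOS/hardware"


if __name__ == "__main__":
    points = parse_trip_points(TRIP_POINTS)
    for temp in (48, 69, 81, 88):
        print(temp, "C ->", classify(temp, points))
```

With these trip points, every reading reported in the thread so far (48 C to 69 C) falls in the range where the OS is not yet expected to intervene.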

> > How s2ram works would be useful info.
>
> No idea.

Well, try it :-). suspend.sf.net.

> It does look like something isn't setting up the ACPI power properly on
> resume, and that the CPU is probably in a busy loop while the machine is
> idle. Just a guess.

Fan is not controlled by ACPI. But we may be saving some memory we
should not save, or something like that.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-09 12:04:44

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop



On Wed, 9 Aug 2006, Steven Rostedt wrote:

> >
> > cat we get contents of /proc/acpi/thermal*/*/* ?
>
> I'm running after a poweroff (left it running over night in the hotel, and
> I'm still in the hotel).
>
> $ grep . /proc/acpi/thermal_zone/THRM/*
> /proc/acpi/thermal_zone/THRM/cooling_mode:<setting not supported>
> /proc/acpi/thermal_zone/THRM/cooling_mode:cooling mode: passive
> /proc/acpi/thermal_zone/THRM/polling_frequency:<polling disabled>
> /proc/acpi/thermal_zone/THRM/state:state: ok
> /proc/acpi/thermal_zone/THRM/temperature:temperature: 48 C
> /proc/acpi/thermal_zone/THRM/trip_points:critical (S5): 88 C
> /proc/acpi/thermal_zone/THRM/trip_points:passive: 81 C: tc1=4 tc2=3 tsp=100 devices=0xcf6c2338
>
> Note thermal_zone/THRM was finished with bash tab completion so they are
> the only things that match the above glob expr.
>

Note: I just did a swsusp and resume and here's the same data:

$ grep . /proc/acpi/thermal_zone/THRM/*
/proc/acpi/thermal_zone/THRM/cooling_mode:<setting not supported>
/proc/acpi/thermal_zone/THRM/cooling_mode:cooling mode: passive
/proc/acpi/thermal_zone/THRM/polling_frequency:<polling disabled>
/proc/acpi/thermal_zone/THRM/state:state: ok
/proc/acpi/thermal_zone/THRM/temperature:temperature: 60 C
/proc/acpi/thermal_zone/THRM/trip_points:critical (S5): 88 C
/proc/acpi/thermal_zone/THRM/trip_points:passive: 81 C: tc1=4 tc2=3 tsp=100 devices=0xcf6c2338


And just leaving my system idle for a few minutes:

$ grep . /proc/acpi/thermal_zone/THRM/temperature
temperature: 62 C

and a few more minutes:

temperature: 64 C


And a few more:

temperature: 66 C


right now after typing this:

temperature: 69 C


So this definitely shows something's not letting the CPU rest.

-- Steve
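
The readings above can also be checked mechanically. A minimal Python sketch, assuming the samples are pasted in as text; the helper names read_temps and trend are hypothetical, and since the interval is only given as "a few minutes", no rate is computed.

```python
import re

# Readings copied from the message above, taken a few minutes apart after resume.
AFTER_RESUME = """\
temperature: 60 C
temperature: 62 C
temperature: 64 C
temperature: 66 C
temperature: 69 C
"""


def read_temps(text):
    """Extract integer degrees C from 'temperature: NN C' lines."""
    return [int(n) for n in re.findall(r"temperature:\s*(\d+)\s*C", text)]


def trend(temps):
    """Return 'rising', 'falling', or 'mixed/flat' from consecutive samples."""
    deltas = [later - earlier for earlier, later in zip(temps, temps[1:])]
    if deltas and all(d > 0 for d in deltas):
        return "rising"
    if deltas and all(d < 0 for d in deltas):
        return "falling"
    return "mixed/flat"


if __name__ == "__main__":
    temps = read_temps(AFTER_RESUME)
    print(temps, "->", trend(temps))
```

The same two helpers work on any later capture, so a run taken after a reboot can be compared against this one without eyeballing the numbers.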


2006-08-09 12:07:37

by Andreas Mohr

Subject: Re: swsusp and suspend2 like to overheat my laptop

On Wed, Aug 09, 2006 at 07:45:23AM -0400, Steven Rostedt wrote:
> It does look like something isn't setting up the ACPI power properly on
> resume, and that the CPU is probably in a busy loop while the machine is
> idle. Just a guess.

In that case could you post
cat /proc/acpi/processor/CPU?/*
?

Perhaps we're losing ACPI C2/C3 state power saving, and checking
the busmaster activity indicators there would be useful, too.

Oh, in this context maybe it's actually a problem of a misbehaving driver?
An active USB mouse is known to distort ACPI power saving, causing reduced
notebook battery operation length (due to busmaster activity preventing
ACPI idling, I think). Now what if some certain driver actually caused
permanent busmaster activity...?

Andreas Mohr

2006-08-09 12:09:08

by Pavel Machek

Subject: Re: swsusp and suspend2 like to overheat my laptop

Hi!

> > > cat we get contents of /proc/acpi/thermal*/*/* ?
> >
> > I'm running after a poweroff (left it running over night in the hotel, and
> > I'm still in the hotel).
> >
> > $ grep . /proc/acpi/thermal_zone/THRM/*
> > /proc/acpi/thermal_zone/THRM/cooling_mode:<setting not supported>
> > /proc/acpi/thermal_zone/THRM/cooling_mode:cooling mode: passive
> > /proc/acpi/thermal_zone/THRM/polling_frequency:<polling disabled>
> > /proc/acpi/thermal_zone/THRM/state:state: ok
> > /proc/acpi/thermal_zone/THRM/temperature:temperature: 48 C
> > /proc/acpi/thermal_zone/THRM/trip_points:critical (S5): 88 C
> > /proc/acpi/thermal_zone/THRM/trip_points:passive: 81 C: tc1=4 tc2=3 tsp=100 devices=0xcf6c2338
> >
> > Note thermal_zone/THRM was finished with bash tab completion so they are
> > the only things that match the above glob expr.
> >
>
> Note: I just did a swsusp and resume and here's the same data:
>
> $ grep . /proc/acpi/thermal_zone/THRM/*
> /proc/acpi/thermal_zone/THRM/cooling_mode:<setting not supported>
> /proc/acpi/thermal_zone/THRM/cooling_mode:cooling mode: passive
> /proc/acpi/thermal_zone/THRM/polling_frequency:<polling disabled>
> /proc/acpi/thermal_zone/THRM/state:state: ok
> /proc/acpi/thermal_zone/THRM/temperature:temperature: 60 C
> /proc/acpi/thermal_zone/THRM/trip_points:critical (S5): 88 C
> /proc/acpi/thermal_zone/THRM/trip_points:passive: 81 C: tc1=4 tc2=3 tsp=100 devices=0xcf6c2338
>
>
> And just leaving my system idle for a few minutes:
>
> $ grep . /proc/acpi/thermal_zone/THRM/temperature
> temperature: 62 C
>
> and a few more minutes:
>
> temperature: 64 C
>
>
> And a few more:
>
> temperature: 66 C
>
>
> right now after typing this:
>
> temperature: 69 C
>
>
> So this definitely shows something's not letting the CPU rest.

Okay, run top to see what goes on, and look for
/proc/acpi/processor/*/* -- you are interested in C states before and
after suspend.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-09 12:14:53

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop


FYI

On Wed, 9 Aug 2006, Steven Rostedt wrote:

>
> Note: I just did a swsusp and resume and here's the same data:
>
> $ grep . /proc/acpi/thermal_zone/THRM/*
> /proc/acpi/thermal_zone/THRM/cooling_mode:<setting not supported>
> /proc/acpi/thermal_zone/THRM/cooling_mode:cooling mode: passive
> /proc/acpi/thermal_zone/THRM/polling_frequency:<polling disabled>
> /proc/acpi/thermal_zone/THRM/state:state: ok
> /proc/acpi/thermal_zone/THRM/temperature:temperature: 60 C
> /proc/acpi/thermal_zone/THRM/trip_points:critical (S5): 88 C
> /proc/acpi/thermal_zone/THRM/trip_points:passive: 81 C: tc1=4 tc2=3 tsp=100 devices=0xcf6c2338
>
>
> And just leaving my system idle for a few minutes:
>
> $ grep . /proc/acpi/thermal_zone/THRM/temperature
> temperature: 62 C
>
> and a few more minutes:
>
> temperature: 64 C
>
>
> And a few more:
>
> temperature: 66 C
>
>
> right now after typing this:
>
> temperature: 69 C
>
>

I just did a soft reboot and right afterwards I get:

$ cat /proc/acpi/thermal_zone/THRM/temperature
temperature: 63 C


waited a few seconds:

temperature: 62 C


After a few minutes (while connecting back home):

temperature: 58 C


And right now:

temperature: 56 C


So it is going the opposite direction after a soft reboot.

-- Steve

2006-08-09 12:16:27

by Rafael J. Wysocki

Subject: Re: swsusp and suspend2 like to overheat my laptop

On Wednesday 09 August 2006 13:58, Pavel Machek wrote:
> Hi!
>
> > > Okay, can you try to leave it up for a week or two (no suspends, no
> > > poweroffs) and see what happens?
> >
> > I've had this laptop running for a couple of months without shutting down
> > and it doesn't have a problem. The only time that I do shut it down
>
> Ok.
>
> > > > > P4 has thermal protection, so you are actually safe.
> > > >
> > > > Yeah, but still, the keyboard gets pretty hot too, and I'm actually more
> > > > worried about damaging something that is close by than damaging the CPU
> > > > itself.
> > >
> > > If you damage something, machine was misdesigned in the first place.
> >
> > agreed, but you never know ;) This laptop is currently my lifeline :)
>
> You'd have good reason to get new one.
>
> > > cat we get contents of /proc/acpi/thermal*/*/* ?
> >
> > I'm running after a poweroff (left it running over night in the hotel, and
> > I'm still in the hotel).
> >
> > $ grep . /proc/acpi/thermal_zone/THRM/*
> > /proc/acpi/thermal_zone/THRM/cooling_mode:<setting not supported>
> > /proc/acpi/thermal_zone/THRM/cooling_mode:cooling mode: passive
> > /proc/acpi/thermal_zone/THRM/polling_frequency:<polling disabled>
> > /proc/acpi/thermal_zone/THRM/state:state: ok
> > /proc/acpi/thermal_zone/THRM/temperature:temperature: 48 C
> > /proc/acpi/thermal_zone/THRM/trip_points:critical (S5): 88 C
> > /proc/acpi/thermal_zone/THRM/trip_points:passive: 81 C: tc1=4 tc2=3 tsp=100 devices=0xcf6c2338
> >
> > Note thermal_zone/THRM was finished with bash tab completion so they are
> > the only things that match the above glob expr.
>
> Ok, so it is the BIOS doing temperature control up to 81 C. At 81 C,
> Linux should start cooling it, and at 88 C, Linux should shut down. At a
> slightly higher temperature, the hardware should do an emergency shutdown.
>
> > > How s2ram works would be useful info.
> >
> > No idea.
>
> Well, try it :-). suspend.sf.net.
>
> > It does look like something isn't setting up the ACPI power properly on
> > resume, and that the CPU is probably in a busy loop while the machine is
> > idle. Just a guess.
>
> Fan is not controlled by ACPI. But we may be saving some memory we
> should not save, or something like that.

If it's a P4, we probably don't, because the ACPI tables should be above the
last pfn in the normal zone. Still, Steven, please send your dmesg after a
fresh boot.

Rafael

2006-08-09 12:36:17

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop


On Wed, 9 Aug 2006, Pavel Machek wrote:

>
> Okay, run top to see what goes on, and look for
> /proc/acpi/processor/*/* -- you are interested in C states before and
> after suspend.
> Pavel

I don't quite understand. What am I looking for in top?

Here's the before and after:

before:

$ grep C /proc/acpi/processor/*/*
/proc/acpi/processor/CPU0/power:active state: C1
/proc/acpi/processor/CPU0/power:max_cstate: C8
/proc/acpi/processor/CPU0/power: *C1: type[C1] promotion[--] demotion[--] latency[000] usage[00000000] duration[00000000000000000000]
/proc/acpi/processor/CPU1/power:active state: C1
/proc/acpi/processor/CPU1/power:max_cstate: C8
/proc/acpi/processor/CPU1/power: *C1: type[C1] promotion[--] demotion[--] latency[000] usage[00000000] duration[00000000000000000000]


after:

$ grep C /proc/acpi/processor/*/*
/proc/acpi/processor/CPU0/power:active state: C1
/proc/acpi/processor/CPU0/power:max_cstate: C8
/proc/acpi/processor/CPU0/power: *C1: type[C1]
promotion[--] demotion[--] latency[000] usage[00000000] duration[00000000000000000000]
/proc/acpi/processor/CPU1/power:active state: C1
/proc/acpi/processor/CPU1/power:max_cstate: C8
/proc/acpi/processor/CPU1/power: *C1: type[C1]
promotion[--] demotion[--] latency[000] usage[00000000] duration[00000000000000000000]
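
A before/after comparison like the one above can also be done mechanically. What follows is only a minimal sketch, not something used in this thread: it assumes the two grep outputs were saved as text, and `parse_power` and `diff_states` are hypothetical helper names.

```python
import re

def parse_power(text):
    """Map (cpu, field) -> value for 'active state' and 'max_cstate' lines."""
    pattern = re.compile(
        r"/proc/acpi/processor/(CPU\d+)/power:\s*(active state|max_cstate):\s*(\S+)")
    return {(cpu, field): value for cpu, field, value in pattern.findall(text)}

def diff_states(before, after):
    """Return {key: (before_value, after_value)} for every field that changed."""
    b, a = parse_power(before), parse_power(after)
    return {k: (b.get(k), a.get(k))
            for k in sorted(set(b) | set(a))
            if b.get(k) != a.get(k)}

# Lines taken from the grep output in this message.
before = """/proc/acpi/processor/CPU0/power:active state: C1
/proc/acpi/processor/CPU0/power:max_cstate: C8
/proc/acpi/processor/CPU1/power:active state: C1
/proc/acpi/processor/CPU1/power:max_cstate: C8
"""
after = before  # the post-suspend capture shows the same values

print(diff_states(before, after))
```

An empty result means the reported C-state fields did not change across the suspend, which matches what the two captures show.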

-- Steve

2006-08-09 12:38:49

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop



On Wed, 9 Aug 2006, Andreas Mohr wrote:

> On Wed, Aug 09, 2006 at 07:45:23AM -0400, Steven Rostedt wrote:
> > It does look like something isn't setting up the ACPI power properly on
> > resume, and that the CPU is probably in a busy loop while the machine is
> > idle. Just a guess.
>
> In that case could you post
> cat /proc/acpi/processor/CPU?/*
> ?

This is after a suspend:

$ cat /proc/acpi/processor/CPU*/*
processor id: 0
acpi id: 0
bus mastering control: yes
power management: no
throttling control: yes
limit interface: yes
active limit: P0:T0
user limit: P0:T0
thermal limit: P0:T0
active state: C1
max_cstate: C8
bus master activity: 00000000
states:
*C1: type[C1] promotion[--] demotion[--] latency[000]
usage[00000000] duration[00000000000000000000]
state count: 4
active state: T0
states:
*T0: 00%
T1: 25%
T2: 50%
T3: 75%
processor id: 1
acpi id: 1
bus mastering control: yes
power management: no
throttling control: yes
limit interface: yes
active limit: P0:T0
user limit: P0:T0
thermal limit: P0:T0
active state: C1
max_cstate: C8
bus master activity: 00000000
states:
*C1: type[C1] promotion[--] demotion[--] latency[000]
usage[00000000] duration[00000000000000000000]
state count: 4
active state: T0
states:
*T0: 00%
T1: 25%
T2: 50%
T3: 75%



>
> Perhaps we're losing ACPI C2/C3 state power saving, and checking
> the busmaster activity indicators there would be useful, too.
>
> Oh, in this context maybe it's actually a problem of a misbehaving driver?
> An active USB mouse is known to distort ACPI power saving, causing reduced
> notebook battery operation length (due to busmaster activity preventing
> ACPI idling, I think). Now what if some certain driver actually caused
> permanent busmaster activity...?

I unplug everything before doing a suspend. Right now I'm leaving the
network connected so I don't need to go through the connection process
again.


Got to leave the hotel now, eat breakfast and go to work ;)

-- Steve

2006-08-09 12:58:57

by Pavel Machek

Subject: Re: swsusp and suspend2 like to overheat my laptop

On Wed 2006-08-09 08:35:47, Steven Rostedt wrote:
>
> On Wed, 9 Aug 2006, Pavel Machek wrote:
>
> >
> > Okay, run top to see what goes on, and look for
> > /proc/acpi/processor/*/* -- you are interested in C states before and
> > after suspend.
>
> I don't quite understand. What am I looking for in top?

Some process that is running and eating 99% cpu when it should not be
running and doing anything?
Pavel
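
top shows per-process usage; the overall picture can be cross-checked from the kernel's own counters. A minimal sketch, assuming two readings of the aggregate `cpu` line of /proc/stat taken a few seconds apart. `busy_fraction` is a hypothetical helper, and the sample numbers are made up for illustration, not taken from the machine discussed here.

```python
def busy_fraction(first, second):
    """Fraction of CPU time spent non-idle between two 'cpu' lines of /proc/stat."""
    a = [int(x) for x in first.split()[1:]]
    b = [int(x) for x in second.split()[1:]]
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas)
    idle = deltas[3]  # the fourth counter is idle time; iowait counts as busy here
    return (total - idle) / total if total else 0.0

# Made-up sample values, for illustration only.
sample_1 = "cpu 100 0 100 800 0 0 0"
sample_2 = "cpu 110 0 110 1780 0 0 0"

print(busy_fraction(sample_1, sample_2))
```

A value near zero while the fan is running flat out would point away from a runaway process and towards the idle path itself.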


> Here's the before and after:
>
> before:
>
> $ grep C /proc/acpi/processor/*/*
> /proc/acpi/processor/CPU0/power:active state: C1
> /proc/acpi/processor/CPU0/power:max_cstate: C8
> /proc/acpi/processor/CPU0/power: *C1: type[C1]
> promotion[--] demotion[--] latency[000] usage[00000000] duration[00000000000000000000]

All zeros? Strange...
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-09 13:03:20

by Andreas Mohr

Subject: Re: swsusp and suspend2 like to overheat my laptop

Hi,

On Wed, Aug 09, 2006 at 08:38:27AM -0400, Steven Rostedt wrote:
> This is after a suspend:
>
> $ cat /proc/acpi/processor/CPU*/*
> processor id: 0
> acpi id: 0
> bus mastering control: yes
> power management: no
> throttling control: yes
> limit interface: yes
> active limit: P0:T0
> user limit: P0:T0
> thermal limit: P0:T0
> active state: C1
> max_cstate: C8
> bus master activity: 00000000
> states:
> *C1: type[C1] promotion[--] demotion[--] latency[000]
> usage[00000000] duration[00000000000000000000]
> state count: 4
> active state: T0
> states:
> *T0: 00%
> T1: 25%
> T2: 50%
> T3: 75%

This is almost *exactly* the same as on my very cheap'n stupid HP/Compaq
desktop P4 HT which doesn't support ACPI C2/C3 at all despite proper support
by other P4 HT desktop machines (missing _CST ACPI object in the DSDT,
as confirmed after messing with Intel's DSDT decompiler):

# cat /proc/acpi/processor/CPU?/*
processor id: 0
acpi id: 1
bus mastering control: no
power management: no
throttling control: yes
limit interface: yes
active limit: P0:T0
user limit: P0:T0
thermal limit: P0:T0
active state: C1
max_cstate: C8
bus master activity: 00000000
states:
*C1: type[C1] promotion[--] demotion[--] latency[000] usage[00000000] duration[00000000000000000000]
state count: 8
active state: T0
states:
*T0: 00%
T1: 12%
T2: 25%
T3: 37%
T4: 50%
T5: 62%
T6: 75%
T7: 87%


Note that

max_cstate: C8

can be considered a bug (this is a C state init value from an ACPI define,
mistakenly left unchanged when _CST is missing): I only have C1, so it
should be set to C1.
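
The mismatch described here (a `max_cstate` deeper than any state actually listed) can be spotted from the file contents alone. A minimal sketch under that reading of the output; the helper names are hypothetical and the sample text is the relevant part of the listing above.

```python
import re

def listed_cstates(power_text):
    """C-state names that appear in the 'states:' section, e.g. ['C1']."""
    return re.findall(r"^\s*\*?(C\d+):\s*type\[", power_text, flags=re.MULTILINE)

def reported_max_cstate(power_text):
    """Value of the 'max_cstate:' line, or None if it is absent."""
    m = re.search(r"^max_cstate:\s*(C\d+)", power_text, flags=re.MULTILINE)
    return m.group(1) if m else None

def max_cstate_mismatch(power_text):
    """True when max_cstate names a deeper state than any state listed."""
    states = listed_cstates(power_text)
    reported = reported_max_cstate(power_text)
    if not states or reported is None:
        return False
    deepest = max(int(s[1:]) for s in states)
    return int(reported[1:]) > deepest

sample = """active state: C1
max_cstate: C8
bus master activity: 00000000
states:
*C1: type[C1] promotion[--] demotion[--] latency[000] usage[00000000] duration[00000000000000000000]
"""

print(listed_cstates(sample), reported_max_cstate(sample), max_cstate_mismatch(sample))
```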

What would be interesting is this output *before* any suspend, not after ;)


Oh, and your temperature after boot goes backwards since booting is a very
active period, obviously.

Andreas

2006-08-09 13:05:54

by Mark Lord

Subject: Re: swsusp and suspend2 like to overheat my laptop

Steven Rostedt wrote:
>
> I would prefer to do both, but I really can't tell you if the $OTHER_OS
> works or not. I booted it once with this machine, and that was only to
> register it with IBM. ;)
>
> After that, I slapped in my Debian install CD and the rest is history.

My own solution for BIOS updates was to reinstall the MS stuff completely
to a small old bootable external USB drive, and place it on a shelf for
contingency / testing uses. That way it doesn't even require a boot sector
from the main drive.

Cheers

2006-08-09 13:17:13

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop


On Wed, 9 Aug 2006, Rafael J. Wysocki wrote:

>
> If it's a P4, we rather don't, because the ACPI tables should be above the
> last pfn in the normal zone. Still, Steven please send your dmesg after a
> fresh boot.
>

Attached is a gzipped version of my dmesg.

-- Steve


Attachments:
dmesg_power.gz (7.64 kB)

2006-08-09 13:35:52

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop


On Wed, 9 Aug 2006, Pavel Machek wrote:

>
> > > How s2ram works would be useful info.
> >
> > No idea.
>
> Well, try it :-). suspend.sf.net.
>

Debian testing has it installed already, so I tried that one.

# s2ram
Machine is unknown.
This machine can be identified by:
sys_vendor = "IBM"
sys_product = "288679U"
sys_version = "ThinkPad G41"
bios_version = "1XET44WW (1.03 )"
See http://en.opensuse.org/S2ram for details.

If you report a problem, please include the complete output above.



So then I tried s2ram -f

Well it went to sleep fine. But when I tried to wake it up again, the
screen didn't come back. I'm not sure if the keyboard was working either.
But I could eject the CD and when I put it back in, it seemed to mount it.

Oh well, I'll have to debug that another day ;)

-- Steve

2006-08-09 13:37:34

by Pavel Machek

Subject: Re: swsusp and suspend2 like to overheat my laptop

On Wed 2006-08-09 09:35:31, Steven Rostedt wrote:
>
> On Wed, 9 Aug 2006, Pavel Machek wrote:
>
> >
> > > > How s2ram works would be useful info.
> > >
> > > No idea.
> >
> > Well, try it :-). suspend.sf.net.
> >
>
> Debian testing has it installed already, so I tried that one.
>
> # s2ram
> Machine is unknown.
> This machine can be identified by:
> sys_vendor = "IBM"
> sys_product = "288679U"
> sys_version = "ThinkPad G41"
> bios_version = "1XET44WW (1.03 )"
> See http://en.opensuse.org/S2ram for details.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> If you report a problem, please include the complete output above.
>
>
>
> So then I tried s2ram -f
>
> Well it went to sleep fine. But when I tried to wake it up again, the
> screen didn't come back. I'm not sure if the keyboard was working either.
> But I could eject the CD and when I put it back in, it seemed to mount it.
>
> Oh well, I'll have to debug that another day ;)

There's a very nice writeup... at underlined address.

You probably want -f -a 3.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-09 13:42:13

by Andreas Mohr

Subject: Re: swsusp and suspend2 like to overheat my laptop

On Wed, Aug 09, 2006 at 09:16:30AM -0400, Steven Rostedt wrote:
>
> On Wed, 9 Aug 2006, Rafael J. Wysocki wrote:
>
> >
> > If it's a P4, we rather don't, because the ACPI tables should be above the
> > last pfn in the normal zone. Still, Steven please send your dmesg after a
> > fresh boot.
> >
>
> Attached is a gzipped version of my dmesg.

This one is fatal:

| ACPI: Found ECDT
| ACPI: Could not use ECDT

And you also have

| ACPI: Processor [CPU0] (supports 4 throttling states)
| ACPI: Processor [CPU1] (supports 4 throttling states)

(IOW, no C2/C3 states listed here)

The buggy ECDT table (see http://www.poupinou.org/acpi/ibm_ecdt.html)
is said to cause ACPI init to fail:
http://t2100cdt.kippona.net/tlinux/archive/linux.toshiba-dme.co.jp/ML/tlinux-users/4300/4396.html
So it's not too astonishing that you don't have C2/C3 states, *always*
(pre-suspend and post-suspend).
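
The same check can be run against any saved dmesg. A minimal sketch, assuming the log is available as text; `acpi_boot_findings` is a hypothetical helper that only looks for the message strings quoted above and makes no claim about other kernel message formats.

```python
import re

def acpi_boot_findings(dmesg_text):
    """Summarise the ACPI boot messages discussed above."""
    processors = dict(
        re.findall(r"ACPI: Processor \[(\w+)\] \(([^)]*)\)", dmesg_text))
    return {
        # The kernel found an ECDT but reported it could not use it.
        "ecdt_failed": "Could not use ECDT" in dmesg_text,
        # Capability text printed per processor, keyed by CPU name.
        "processors": processors,
        # True if any processor line mentions C2 or C3.
        "deep_cstates_listed": any(
            re.search(r"\bC[23]\b", caps) for caps in processors.values()),
    }

# The lines quoted from the attached dmesg.
dmesg = """ACPI: Found ECDT
ACPI: Could not use ECDT
ACPI: Processor [CPU0] (supports 4 throttling states)
ACPI: Processor [CPU1] (supports 4 throttling states)
"""

print(acpi_boot_findings(dmesg))
```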

However the machine should still do normal HLT idle loop which should
manage to keep it reasonably cool, right?

Given this ECDT table issue it's very possible that this is the reason for
Linux ACPI layer misbehaviour after resume.

Googling "ACPI ECDT" might help, too.

In any case, you could do some kernel logging around pm_idle* in
drivers/acpi/processor_idle.c since I suspect that this is what changes
after resume to cause the idling to fail.

Andreas Mohr

2006-08-09 13:45:35

by Brad Campbell

Subject: Re: [Suspend2-devel] Re: swsusp and suspend2 like to overheat my laptop

Steven Rostedt wrote:
>
> Well it went to sleep fine. But when I tried to wake it up again, the
> screen didn't come back. I'm not sure if the keyboard was working either.
> But I could eject the CD and when I put it back in, it seemed to mount it.
>

Different laptop of course.. but good results can often be had with
s2ram -f -s

Running the full array of command line permutations can be somewhat tedious though. A good initramfs
and separate grub boot entry help there a great deal (no fsck if you lock it up).
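
To take some of the tedium out of it, the candidate command lines can be generated rather than typed. A minimal sketch that only prints them and runs nothing; it uses just the options mentioned in this thread (-f, -s, -a 3), `s2ram_candidates` is a hypothetical helper, and whether every combination is meaningful on a given machine is not something established here.

```python
from itertools import combinations

def s2ram_candidates(base=("s2ram", "-f"), optional=(("-s",), ("-a", "3"))):
    """List every command line made of the base plus a subset of the options."""
    commands = []
    for size in range(len(optional) + 1):
        for subset in combinations(optional, size):
            cmd = list(base)
            for option in subset:
                cmd.extend(option)
            commands.append(" ".join(cmd))
    return commands

for line in s2ram_candidates():
    print(line)
```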

Brad
--
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams

2006-08-09 20:33:14

by Rafael J. Wysocki

Subject: Re: swsusp and suspend2 like to overheat my laptop

On Wednesday 09 August 2006 15:16, Steven Rostedt wrote:
>
> On Wed, 9 Aug 2006, Rafael J. Wysocki wrote:
>
> >
> > If it's a P4, we rather don't, because the ACPI tables should be above the
> > last pfn in the normal zone. Still, Steven please send your dmesg after a
> > fresh boot.
> >
>
> Attached is a gzipped version of my dmesg.

Thanks.

I don't think we overwrite anything important from the hardware's perspective.

Rafael

2006-08-11 00:14:42

by Steven Rostedt

Subject: Re: swsusp and suspend2 like to overheat my laptop


On Wed, 9 Aug 2006, Pavel Machek wrote:

>
> Some process that is running and eating 99% cpu when it should not be
> running and doing anything?
> Pavel

Yeah, I've already looked for a rogue process, but the CPU is actually
pretty idle.

-- Steve