2006-08-09 12:27:31

by Rafael J. Wysocki

Subject: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

Hi,

It looks like the CMOS clock gets corrupted during the suspend to disk
on i386. I've observed this on 2 different boxes. Moreover, one of them is
AMD64-based and the x86_64 kernel doesn't have this problem on it.

Also, I've done some tests that indicate the corruption doesn't occur before
saving the suspend image. It rather happens when the box is powered off
or rebooted (tested both cases).

Unfortunately, I have no more time to debug it further right now.

Greetings,
Rafael


2006-08-09 12:31:10

by Pavel Machek

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

Hi!
>
> It looks like the CMOS clock gets corrupted during the suspend to disk
> on i386. I've observed this on 2 different boxes. Moreover, one of them is
> AMD64-based and the x86_64 kernel doesn't have this problem on it.
>
> Also, I've done some tests that indicate the corruption doesn't occur before
> saving the suspend image. It rather happens when the box is powered off
> or rebooted (tested both cases).
>
> Unfortunately, I have no more time to debug it further right now.

Do you have Linus' "please corrupt my cmos for debuggin" hack enabled?
:-).
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-09 17:44:07

by john stultz

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Wed, 2006-08-09 at 14:26 +0200, Rafael J. Wysocki wrote:
> It looks like the CMOS clock gets corrupted during the suspend to disk
> on i386. I've observed this on 2 different boxes. Moreover, one of them is
> AMD64-based and the x86_64 kernel doesn't have this problem on it.
>
> Also, I've done some tests that indicate the corruption doesn't occur before
> saving the suspend image. It rather happens when the box is powered off
> or rebooted (tested both cases).

Hmmm. Could you better describe the corruption you're seeing?

I've just gotten a report about uptime reporting odd values after resume
when the CMOS clock was set to the past during a suspend to disk, but
that's somewhat expected and I would think it would occur on x86_64 as
well.

thanks
-john


2006-08-09 20:02:20

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Wednesday 09 August 2006 14:30, Pavel Machek wrote:
> Hi!
> >
> > It looks like the CMOS clock gets corrupted during the suspend to disk
> > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> >
> > Also, I've done some tests that indicate the corruption doesn't occur before
> > saving the suspend image. It rather happens when the box is powered off
> > or rebooted (tested both cases).
> >
> > Unfortunately, I have no more time to debug it further right now.
>
> Do you have Linus' "please corrupt my cmos for debuggin" hack enabled?

Well, I know nothing about that. ;-)

Rafael

2006-08-09 20:04:51

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Wednesday 09 August 2006 19:44, john stultz wrote:
> On Wed, 2006-08-09 at 14:26 +0200, Rafael J. Wysocki wrote:
> > It looks like the CMOS clock gets corrupted during the suspend to disk
> > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> >
> > Also, I've done some tests that indicate the corruption doesn't occur before
> > saving the suspend image. It rather happens when the box is powered off
> > or rebooted (tested both cases).
>
> Hmmm. Could you better describe the corruption you're seeing?

After I do "echo disk > /sys/power/state" and the system suspends, the
CMOS clock settings, as visible via the BIOS setup, are more or less random.

Rafael

2006-08-09 20:12:51

by Andrew Morton

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Wed, 9 Aug 2006 22:01:42 +0200
"Rafael J. Wysocki" wrote:

> On Wednesday 09 August 2006 14:30, Pavel Machek wrote:
> > Hi!
> > >
> > > It looks like the CMOS clock gets corrupted during the suspend to disk
> > > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> > >
> > > Also, I've done some tests that indicate the corruption doesn't occur before
> > > saving the suspend image. It rather happens when the box is powered off
> > > or rebooted (tested both cases).
> > >
> > > Unfortunately, I have no more time to debug it further right now.
> >
> > Do you have Linus' "please corrupt my cmos for debuggin" hack enabled?
>
> Well, I know nothing about that. ;-)
>

CONFIG_PM_TRACE=y will scrog your CMOS clock each time you suspend.

2006-08-09 20:13:51

by john stultz

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Wed, 2006-08-09 at 22:04 +0200, Rafael J. Wysocki wrote:
> On Wednesday 09 August 2006 19:44, john stultz wrote:
> > On Wed, 2006-08-09 at 14:26 +0200, Rafael J. Wysocki wrote:
> > > It looks like the CMOS clock gets corrupted during the suspend to disk
> > > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> > >
> > > Also, I've done some tests that indicate the corruption doesn't occur before
> > > saving the suspend image. It rather happens when the box is powered off
> > > or rebooted (tested both cases).
> >
> > Hmmm. Could you better describe the corruption you're seeing?
>
> After I do "echo disk > /sys/power/state" and the system suspends, the
> CMOS clock settings, as visible via the BIOS setup, are more or less random.

And after resuming does time output the time/date properly, or is it
confused as well?

thanks
-john



2006-08-09 20:52:58

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Wednesday 09 August 2006 22:12, Andrew Morton wrote:
> On Wed, 9 Aug 2006 22:01:42 +0200
> "Rafael J. Wysocki" wrote:
>
> > On Wednesday 09 August 2006 14:30, Pavel Machek wrote:
> > > Hi!
> > > >
> > > > It looks like the CMOS clock gets corrupted during the suspend to disk
> > > > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > > > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> > > >
> > > > Also, I've done some tests that indicate the corruption doesn't occur before
> > > > saving the suspend image. It rather happens when the box is powered off
> > > > or rebooted (tested both cases).
> > > >
> > > > Unfortunately, I have no more time to debug it further right now.
> > >
> > > Do you have Linus' "please corrupt my cmos for debuggin" hack enabled?
> >
> > Well, I know nothing about that. ;-)
> >
>
> CONFIG_PM_TRACE=y will scrog your CMOS clock each time you suspend.

Oh dear. Of course it's set in my .config. Thanks a lot for this hint. :-)

BTW, it's a dangerous setting, because some drivers get mad if the time after
the resume appears to be earlier than the time before the suspend. Also the
timer .suspend/.resume routines aren't prepared for that.

2006-08-10 00:12:46

by Pavel Machek

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

Hi!

> > > > > It looks like the CMOS clock gets corrupted during the suspend to disk
> > > > > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > > > > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> > > > >
> > > > > Also, I've done some tests that indicate the corruption doesn't occur before
> > > > > saving the suspend image. It rather happens when the box is powered off
> > > > > or rebooted (tested both cases).
> > > > >
> > > > > Unfortunately, I have no more time to debug it further right now.
> > > >
> > > > Do you have Linus' "please corrupt my cmos for debuggin" hack enabled?
> > >
> > > Well, I know nothing about that. ;-)
> >
> > CONFIG_PM_TRACE=y will scrog your CMOS clock each time you suspend.
>
> Oh dear. Of course it's set in my .config. Thanks a lot for this hint. :-)
>
> BTW, it's a dangerous setting, because some drivers get mad if the time after
> the resume appears to be earlier than the time before the suspend. Also the
> timer .suspend/.resume routines aren't prepared for that.

Its config option should just go away. People comfortable using *that*
should just edit some header file. Rafael, could you do patch doing
something like that?

--
Thanks for all the (sleeping) penguins.

2006-08-10 12:16:30

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

Hi,

On Thursday 10 August 2006 02:12, Pavel Machek wrote:
> > > > > > It looks like the CMOS clock gets corrupted during the suspend to disk
> > > > > > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > > > > > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> > > > > >
> > > > > > Also, I've done some tests that indicate the corruption doesn't occur before
> > > > > > saving the suspend image. It rather happens when the box is powered off
> > > > > > or rebooted (tested both cases).
> > > > > >
> > > > > > Unfortunately, I have no more time to debug it further right now.
> > > > >
> > > > > Do you have Linus' "please corrupt my cmos for debuggin" hack enabled?
> > > >
> > > > Well, I know nothing about that. ;-)
> > >
> > > CONFIG_PM_TRACE=y will scrog your CMOS clock each time you suspend.
> >
> > Oh dear. Of course it's set in my .config. Thanks a lot for this hint. :-)
> >
> > BTW, it's a dangerous setting, because some drivers get mad if the time after
> > the resume appears to be earlier than the time before the suspend. Also the
> > timer .suspend/.resume routines aren't prepared for that.
>
> Its config option should just go away. People comfortable using *that*
> should just edit some header file. Rafael, could you do patch doing
> something like that?

Just remove the option from Kconfig or the whole setting?

Shouldn't we also change the timer .resume() routines to check if the time
after the resume is later than (or at least the same as) the time before the
suspend and set the "sleep length" to 0 if not?

Rafael
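
The check proposed above could look roughly like the following sketch. It is not the kernel's actual timer resume code: the function name `sleep_length` and its parameters are hypothetical, and it assumes both timestamps are CMOS clock readings expressed in seconds.

```c
/*
 * Hypothetical helper, not taken from the kernel sources.
 * Both arguments are assumed to be CMOS clock readings in seconds.
 */
static unsigned long sleep_length(unsigned long before_suspend,
                                  unsigned long after_resume)
{
    /* Clock appears to have gone backwards: report no sleep at all. */
    if (after_resume < before_suspend)
        return 0;

    return after_resume - before_suspend;
}
```

Without the comparison, the unsigned subtraction would wrap around to a very large value whenever the clock reads earlier after the resume than before the suspend.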

2006-08-10 20:52:24

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

Hi,

On Thursday 10 August 2006 14:15, Rafael J. Wysocki wrote:
> On Thursday 10 August 2006 02:12, Pavel Machek wrote:
> > > > > > > It looks like the CMOS clock gets corrupted during the suspend to disk
> > > > > > > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > > > > > > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> > > > > > >
> > > > > > > Also, I've done some tests that indicate the corruption doesn't occur before
> > > > > > > saving the suspend image. It rather happens when the box is powered off
> > > > > > > or rebooted (tested both cases).
> > > > > > >
> > > > > > > Unfortunately, I have no more time to debug it further right now.
> > > > > >
> > > > > > Do you have Linus' "please corrupt my cmos for debuggin" hack enabled?
> > > > >
> > > > > Well, I know nothing about that. ;-)
> > > >
> > > > CONFIG_PM_TRACE=y will scrog your CMOS clock each time you suspend.
> > >
> > > Oh dear. Of course it's set in my .config. Thanks a lot for this hint. :-)
> > >
> > > BTW, it's a dangerous setting, because some drivers get mad if the time after
> > > the resume appears to be earlier than the time before the suspend. Also the
> > > timer .suspend/.resume routines aren't prepared for that.
> >
> > Its config option should just go away. People comfortable using *that*
> > should just edit some header file. Rafael, could you do patch doing
> > something like that?
>
> Just remove the option from Kconfig or the whole setting?
>
> Shouldn't we also change the timer .resume() routines to check if the time
> after the resume is later than (or at least the same as) the time before the
> suspend and set the "sleep length" to 0 if not?

Hm, I'm thinking it may actually be useful to have in Kconfig and if we change
the timer resume to detect the dangerous situation and prevent it from
happening, that should be sufficient.

Rafael

2006-08-12 19:08:39

by Jesse Brandeburg

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On 8/10/06, Rafael J. Wysocki wrote:
> > > > > CONFIG_PM_TRACE=y will scrog your CMOS clock each time you suspend.
> > > >
> > > > Oh dear. Of course it's set in my .config. Thanks a lot for this hint. :-)
> > > >
> > > > BTW, it's a dangerous setting, because some drivers get mad if the time after
> > > > the resume appears to be earlier than the time before the suspend. Also the
> > > > timer .suspend/.resume routines aren't prepared for that.
> > >
> > > Its config option should just go away. People comfortable using *that*
> > > should just edit some header file. Rafael, could you do patch doing
> > > something like that?

I've seen this problem too, thought it was only mm.
Should the problem go away if I disable CONFIG_PM_TRACE?

2006-08-12 19:12:47

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Saturday 12 August 2006 21:08, Jesse Brandeburg wrote:
> On 8/10/06, Rafael J. Wysocki wrote:
> > > > > > CONFIG_PM_TRACE=y will scrog your CMOS clock each time you suspend.
> > > > >
> > > > > Oh dear. Of course it's set in my .config. Thanks a lot for this hint. :-)
> > > > >
> > > > > BTW, it's a dangerous setting, because some drivers get mad if the time after
> > > > > the resume appears to be earlier than the time before the suspend. Also the
> > > > > timer .suspend/.resume routines aren't prepared for that.
> > > >
> > > > Its config option should just go away. People comfortable using *that*
> > > > should just edit some header file. Rafael, could you do patch doing
> > > > something like that?
>
> I've seen this problem too, thought it was only mm.
> Should the problem go away if I disable CONFIG_PM_TRACE?

Yes.

2006-08-13 22:33:37

by Pavel Machek

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Thu 2006-08-10 14:15:21, Rafael J. Wysocki wrote:
> Hi,
>
> On Thursday 10 August 2006 02:12, Pavel Machek wrote:
> > > > > > > It looks like the CMOS clock gets corrupted during the suspend to disk
> > > > > > > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > > > > > > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> > > > > > >
> > > > > > > Also, I've done some tests that indicate the corruption doesn't occur before
> > > > > > > saving the suspend image. It rather happens when the box is powered off
> > > > > > > or rebooted (tested both cases).
> > > > > > >
> > > > > > > Unfortunately, I have no more time to debug it further right now.
> > > > > >
> > > > > > Do you have Linus' "please corrupt my cmos for debuggin" hack enabled?
> > > > >
> > > > > Well, I know nothing about that. ;-)
> > > >
> > > > CONFIG_PM_TRACE=y will scrog your CMOS clock each time you suspend.
> > >
> > > Oh dear. Of course it's set in my .config. Thanks a lot for this hint. :-)
> > >
> > > BTW, it's a dangerous setting, because some drivers get mad if the time after
> > > the resume appears to be earlier than the time before the suspend. Also the
> > > timer .suspend/.resume routines aren't prepared for that.
> >
> > Its config option should just go away. People comfortable using *that*
> > should just edit some header file. Rafael, could you do patch doing
> > something like that?
>
> Just remove the option from Kconfig or the whole setting?

Removing it from Kconfig should be enough.
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-13 22:34:14

by Pavel Machek

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Thu 2006-08-10 22:51:20, Rafael J. Wysocki wrote:
> Hi,
>
> On Thursday 10 August 2006 14:15, Rafael J. Wysocki wrote:
> > On Thursday 10 August 2006 02:12, Pavel Machek wrote:
> > > > > > > > It looks like the CMOS clock gets corrupted during the suspend to disk
> > > > > > > > on i386. I've observed this on 2 different boxes. Moreover, one of them is
> > > > > > > > AMD64-based and the x86_64 kernel doesn't have this problem on it.
> > > > > > > >
> > > > > > > > Also, I've done some tests that indicate the corruption doesn't occur before
> > > > > > > > saving the suspend image. It rather happens when the box is powered off
> > > > > > > > or rebooted (tested both cases).
> > > > > > > >
> > > > > > > > Unfortunately, I have no more time to debug it further right now.
> > > > > > >
> > > > > > > Do you have Linus' "please corrupt my cmos for debuggin" hack enabled?
> > > > > >
> > > > > > Well, I know nothing about that. ;-)
> > > > >
> > > > > CONFIG_PM_TRACE=y will scrog your CMOS clock each time you suspend.
> > > >
> > > > Oh dear. Of course it's set in my .config. Thanks a lot for this hint. :-)
> > > >
> > > > BTW, it's a dangerous setting, because some drivers get mad if the time after
> > > > the resume appears to be earlier than the time before the suspend. Also the
> > > > timer .suspend/.resume routines aren't prepared for that.
> > >
> > > Its config option should just go away. People comfortable using *that*
> > > should just edit some header file. Rafael, could you do patch doing
> > > something like that?
> >
> > Just remove the option from Kconfig or the whole setting?
> >
> > Shouldn't we also change the timer .resume() routines to check if the time
> > after the resume is later than (or at least the same as) the time before the
> > suspend and set the "sleep length" to 0 if not?
>
> Hm, I'm thinking it may actually be useful to have in Kconfig and if we change
> the timer resume to detect the dangerous situation and prevent it from
> happening, that should be sufficient.

Well, it is still too easy to shoot yourself in the foot. Your time
will be wrong if you enable an innocent-sounding option.
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-14 08:43:47

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc4 (and earlier): CMOS clock corruption during suspend to disk on i386

On Monday 14 August 2006 00:33, Pavel Machek wrote:
> On Thu 2006-08-10 22:51:20, Rafael J. Wysocki wrote:
]--snip--[
> > > > > BTW, it's a dangerous setting, because some drivers get mad if the time after
> > > > > the resume appears to be earlier than the time before the suspend. Also the
> > > > > timer .suspend/.resume routines aren't prepared for that.
> > > >
> > > > Its config option should just go away. People comfortable using *that*
> > > > should just edit some header file. Rafael, could you do patch doing
> > > > something like that?
> > >
> > > Just remove the option from Kconfig or the whole setting?
> > >
> > > Shouldn't we also change the timer .resume() routines to check if the time
> > > after the resume is later than (or at least the same as) the time before the
> > > suspend and set the "sleep length" to 0 if not?
> >
> > Hm, I'm thinking it may actually be useful to have in Kconfig and if we change
> > the timer resume to detect the dangerous situation and prevent it from
> > happening, that should be sufficient.
>
> Well, it is still too easy to shoot yourself in the foot. Your time
> will be wrong if you enable an innocent-sounding option.

We can add DANGEROUS to it. :-)

Rafael