2006-08-13 21:11:06

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH -mm 0/2] Detect clock skew during suspend

Hi,

If the CMOS timer is changed when the system is suspended to disk in such a
way that the time during the resume turns out to be earlier than the time
before the suspend, the resume often fails and the system hangs (spins
forever in the idle thread) due to driver problems.

For this reason it seems reasonable to make the timer .resume() routines
detect such situations and prevent them from happening, which is done
in the following two patches for i386 and x86_64.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller


2006-08-13 21:11:31

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH -mm 2/2] x86_64: Detect clock skew during suspend

Detect the situations in which the time after a resume from disk would
be earlier than the time before the suspend and prevent them from
happening on x86_64.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
arch/x86_64/kernel/time.c | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletion(-)

Index: linux-2.6.18-rc4-mm1/arch/x86_64/kernel/time.c
===================================================================
--- linux-2.6.18-rc4-mm1.orig/arch/x86_64/kernel/time.c
+++ linux-2.6.18-rc4-mm1/arch/x86_64/kernel/time.c
@@ -1039,8 +1039,16 @@ static int timer_resume(struct sys_devic
unsigned long flags;
unsigned long sec;
unsigned long ctime = get_cmos_time();
- unsigned long sleep_length = (ctime - sleep_start) * HZ;
+ long sleep_length = (ctime - sleep_start) * HZ;

+ if (sleep_length < 0) {
+ printk(KERN_WARNING "Time skew detected in timer resume!\n");
+ /* The time after the resume must not be earlier than the time
+ * before the suspend or some nasty things will happen
+ */
+ sleep_length = 0;
+ ctime = sleep_start;
+ }
if (vxtime.hpet_address)
hpet_reenable();
else

2006-08-13 21:11:04

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH -mm 1/2] i386: Detect clock skew during suspend

Detect the situations in which the time after a resume from disk would
be earlier than the time before the suspend and prevent them from
happening on i386.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
arch/i386/kernel/time.c | 24 ++++++++++++++++++------
1 files changed, 18 insertions(+), 6 deletions(-)

Index: linux-2.6.18-rc4-mm1/arch/i386/kernel/time.c
===================================================================
--- linux-2.6.18-rc4-mm1.orig/arch/i386/kernel/time.c
+++ linux-2.6.18-rc4-mm1/arch/i386/kernel/time.c
@@ -285,16 +285,19 @@ void notify_arch_cmos_timer(void)
mod_timer(&sync_cmos_timer, jiffies + 1);
}

-static long clock_cmos_diff, sleep_start;
+static long clock_cmos_diff;
+static unsigned long sleep_start;

static int timer_suspend(struct sys_device *dev, pm_message_t state)
{
/*
* Estimate time zone so that set_time can update the clock
*/
- clock_cmos_diff = -get_cmos_time();
+ unsigned long ctime = get_cmos_time();
+
+ clock_cmos_diff = -ctime;
clock_cmos_diff += get_seconds();
- sleep_start = get_cmos_time();
+ sleep_start = ctime;
return 0;
}

@@ -302,16 +305,25 @@ static int timer_resume(struct sys_devic
{
unsigned long flags;
unsigned long sec;
- unsigned long sleep_length;
+ unsigned long ctime = get_cmos_time();
+ long sleep_length = (ctime - sleep_start) * HZ;
struct timespec ts;
+
+ if (sleep_length < 0) {
+ printk(KERN_WARNING "Time skew detected in timer resume!\n");
+ /* The time after the resume must not be earlier than the time
+ * before the suspend or some nasty things will happen
+ */
+ sleep_length = 0;
+ ctime = sleep_start;
+ }
#ifdef CONFIG_HPET_TIMER
if (is_hpet_enabled())
hpet_reenable();
#endif
setup_pit_timer();
- sec = get_cmos_time() + clock_cmos_diff;
- sleep_length = (get_cmos_time() - sleep_start) * HZ;

+ sec = ctime + clock_cmos_diff;
ts.tv_sec = sec;
ts.tv_nsec = 0;
do_settimeofday(&ts);

2006-08-13 21:22:00

by Pavel Machek

[permalink] [raw]
Subject: Re: [RFC][PATCH -mm 0/2] Detect clock skew during suspend

Hi!

> If the CMOS timer is changed when the system is suspended to disk in such a
> way that the time during the resume turns out to be earlier than the time
> before the suspend, the resume often fails and the system hangs (spins
> forever in the idle thread) due to driver problems.
>
> For this reason it seems reasonable to make the timer .resume() routines
> detect such situations and prevent them from happening, which is done
> in the following two patches for i386 and x86_64.

"Clock skew detected"... Maybe "time going backwards detected" ?

Anyway, patches look good to me, ACK.
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-08-14 18:15:37

by Andi Kleen

[permalink] [raw]
Subject: Re: [RFC][PATCH -mm 2/2] x86_64: Detect clock skew during suspend

On Sunday 13 August 2006 23:07, Rafael J. Wysocki wrote:
> Detect the situations in which the time after a resume from disk would
> be earlier than the time before the suspend and prevent them from
> happening on x86_64.

Merged thanks

-Andi

2006-08-14 19:58:52

by john stultz

[permalink] [raw]
Subject: Re: [RFC][PATCH -mm 1/2] i386: Detect clock skew during suspend

On Sun, 2006-08-13 at 23:06 +0200, Rafael J. Wysocki wrote:
> Detect the situations in which the time after a resume from disk would
> be earlier than the time before the suspend and prevent them from
> happening on i386.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>

One minor comment, but otherwise looks good.

> @@ -302,16 +305,25 @@ static int timer_resume(struct sys_devic
> {
> unsigned long flags;
> unsigned long sec;
> - unsigned long sleep_length;
> + unsigned long ctime = get_cmos_time();
> + long sleep_length = (ctime - sleep_start) * HZ;
> struct timespec ts;
> +
> + if (sleep_length < 0) {
> + printk(KERN_WARNING "Time skew detected in timer resume!\n");

Please make sure the warning describes the CMOS clock going backwards,
rather then just the vague "time skew detected" comment.

> + /* The time after the resume must not be earlier than the time

s/time/CMOS clock/



Acked-by: John Stultz <[email protected]>

thanks
-john

2006-08-14 20:55:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH -mm 1/2] i386: Detect clock skew during suspend

On Monday 14 August 2006 21:58, john stultz wrote:
> On Sun, 2006-08-13 at 23:06 +0200, Rafael J. Wysocki wrote:
> > Detect the situations in which the time after a resume from disk would
> > be earlier than the time before the suspend and prevent them from
> > happening on i386.
> >
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
>
> One minor comment, but otherwise looks good.
>
> > @@ -302,16 +305,25 @@ static int timer_resume(struct sys_devic
> > {
> > unsigned long flags;
> > unsigned long sec;
> > - unsigned long sleep_length;
> > + unsigned long ctime = get_cmos_time();
> > + long sleep_length = (ctime - sleep_start) * HZ;
> > struct timespec ts;
> > +
> > + if (sleep_length < 0) {
> > + printk(KERN_WARNING "Time skew detected in timer resume!\n");
>
> Please make sure the warning describes the CMOS clock going backwards,
> rather then just the vague "time skew detected" comment.
>
> > + /* The time after the resume must not be earlier than the time
>
> s/time/CMOS clock/
>
>
>
> Acked-by: John Stultz <[email protected]>

Thanks for the comment and ACK. :-)

Second revision of the patch follows.

Greetings,
Rafael


---
Detect the situations in which the time after a resume from disk would
be earlier than the time before the suspend and prevent them from
happening on i386.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: John Stultz <[email protected]>
Acked-by: Pavel Machek <[email protected]>
---
arch/i386/kernel/time.c | 24 ++++++++++++++++++------
1 files changed, 18 insertions(+), 6 deletions(-)

Index: linux-2.6.18-rc4-mm1/arch/i386/kernel/time.c
===================================================================
--- linux-2.6.18-rc4-mm1.orig/arch/i386/kernel/time.c
+++ linux-2.6.18-rc4-mm1/arch/i386/kernel/time.c
@@ -285,16 +285,19 @@ void notify_arch_cmos_timer(void)
mod_timer(&sync_cmos_timer, jiffies + 1);
}

-static long clock_cmos_diff, sleep_start;
+static long clock_cmos_diff;
+static unsigned long sleep_start;

static int timer_suspend(struct sys_device *dev, pm_message_t state)
{
/*
* Estimate time zone so that set_time can update the clock
*/
- clock_cmos_diff = -get_cmos_time();
+ unsigned long ctime = get_cmos_time();
+
+ clock_cmos_diff = -ctime;
clock_cmos_diff += get_seconds();
- sleep_start = get_cmos_time();
+ sleep_start = ctime;
return 0;
}

@@ -302,16 +305,25 @@ static int timer_resume(struct sys_devic
{
unsigned long flags;
unsigned long sec;
- unsigned long sleep_length;
+ unsigned long ctime = get_cmos_time();
+ long sleep_length = (ctime - sleep_start) * HZ;
struct timespec ts;
+
+ if (sleep_length < 0) {
+ printk(KERN_WARNING "CMOS clock skew detected in timer resume!\n");
+ /* The time after the resume must not be earlier than the time
+ * before the suspend or some nasty things will happen
+ */
+ sleep_length = 0;
+ ctime = sleep_start;
+ }
#ifdef CONFIG_HPET_TIMER
if (is_hpet_enabled())
hpet_reenable();
#endif
setup_pit_timer();
- sec = get_cmos_time() + clock_cmos_diff;
- sleep_length = (get_cmos_time() - sleep_start) * HZ;

+ sec = ctime + clock_cmos_diff;
ts.tv_sec = sec;
ts.tv_nsec = 0;
do_settimeofday(&ts);