2005-01-08 09:48:19

by Nigel Cunningham

[permalink] [raw]
Subject: [RFC] Patches to reduce delay in arch/kernel/time.c

Hi all.

Over the past couple of months there's been a fair bit of discussion
regarding clock drift after suspending, big pauses during suspending and
resuming and so on. I believe this set of three patches, applied on top
of David's patch (now in Linus' tree) may address the remaining issues.
Comments please!

Patch 1: Replace multiple calls to get_cmos_time in both suspend and
resume routines with a single call. Since get_cmos_time waits for the
start of the next second before returning the result, This
reduces the delay for suspending and resuming by one second in each
case.
Patch 2: Make sleep start an unsigned long. This appears to address the
occasional 1hr 10min error seen by myself and John
(http://lkml.org/lkml/2004/12/15/290). Better solution?
Patch 3: Implement a new __get_cmos_time() function which doesn't delay
until the start of the new second before returning. Use it in suspending
(feel free to correct me, but I don't think having the
exact start of the second is necessary here).

Regards,

Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com

Ph: +61 (2) 6292 8028 Mob: +61 (417) 100 574


2005-01-08 09:48:19

by Nigel Cunningham

[permalink] [raw]
Subject: Patch 1/3: Reduce number of get_cmos_time_calls.

Reduce the number of calls to get_cmos_time. Since get_cmos_time waits
for the start of a new second, two consecutive calls add one full second
to the time to suspend/resume.

Signed-off-by: Nigel Cunningham <[email protected]>

diff -ruNp 911-old/arch/i386/kernel/time.c 911-new/arch/i386/kernel/time.c
--- 911-old/arch/i386/kernel/time.c 2005-01-08 19:36:46.107382632 +1100
+++ 911-new/arch/i386/kernel/time.c 2005-01-08 19:36:25.439524624 +1100
@@ -326,9 +326,11 @@ static int timer_suspend(struct sys_devi
/*
* Estimate time zone so that set_time can update the clock
*/
- clock_cmos_diff = -get_cmos_time();
+ long cmos_time = get_cmos_time();
+
+ clock_cmos_diff = -cmos_time;
clock_cmos_diff += get_seconds();
- sleep_start = get_cmos_time();
+ sleep_start = cmos_time;
return 0;
}

@@ -337,13 +339,15 @@ static int timer_resume(struct sys_devic
unsigned long flags;
unsigned long sec;
unsigned long sleep_length;
+ unsigned long cmos_time;

#ifdef CONFIG_HPET_TIMER
if (is_hpet_enabled())
hpet_reenable();
#endif
- sec = get_cmos_time() + clock_cmos_diff;
- sleep_length = (get_cmos_time() - sleep_start) * HZ;
+ cmos_time = get_cmos_time();
+ sec = cmos_time + clock_cmos_diff;
+ sleep_length = (cmos_time - sleep_start) * HZ;
write_seqlock_irqsave(&xtime_lock, flags);
xtime.tv_sec = sec;
xtime.tv_nsec = 0;


2005-01-08 09:48:17

by Nigel Cunningham

[permalink] [raw]
Subject: Patch 2/3: Reduce number of get_cmos_time_calls.

Change sleep_start from signed to unsigned long. This appears to address
an issue with the clock being occasionally off by around 1 hr 10
minutes.

Signed-off-by: Nigel Cunningham <[email protected]>

diff -ruNp 912-old/arch/i386/kernel/time.c 912-new/arch/i386/kernel/time.c
--- 912-old/arch/i386/kernel/time.c 2005-01-08 19:38:33.786012992 +1100
+++ 912-new/arch/i386/kernel/time.c 2005-01-08 19:38:15.058859952 +1100
@@ -319,7 +319,8 @@ unsigned long get_cmos_time(void)
return retval;
}

-static long clock_cmos_diff, sleep_start;
+static long clock_cmos_diff;
+static unsigned long sleep_start;

static int timer_suspend(struct sys_device *dev, u32 state)
{


2005-01-08 09:48:13

by Nigel Cunningham

[permalink] [raw]
Subject: Patch 3/3: Reduce number of get_cmos_time_calls.

Create new __get_cmos_time patch, which doesn't wait for the start of a
new second before returning. Adjust timer_suspend to use this as we
don't appear to need the exact start of a second when suspending.

Signed-off-by: Nigel Cunningham <[email protected]>

diff -ruNp 913-old/arch/i386/kernel/time.c 913-new/arch/i386/kernel/time.c
--- 913-old/arch/i386/kernel/time.c 2005-01-08 19:40:06.149971536 +1100
+++ 913-new/arch/i386/kernel/time.c 2005-01-08 19:39:51.915135560 +1100
@@ -327,7 +327,7 @@ static int timer_suspend(struct sys_devi
/*
* Estimate time zone so that set_time can update the clock
*/
- long cmos_time = get_cmos_time();
+ long cmos_time = __get_cmos_time();

clock_cmos_diff = -cmos_time;
clock_cmos_diff += get_seconds();
diff -ruNp 913-old/arch/x86_64/kernel/time.c 913-new/arch/x86_64/kernel/time.c
--- 913-old/arch/x86_64/kernel/time.c 2004-12-10 14:27:08.000000000 +1100
+++ 913-new/arch/x86_64/kernel/time.c 2005-01-08 19:39:24.664278320 +1100
@@ -499,11 +499,56 @@ unsigned long long sched_clock(void)
return cycles_2_ns(a);
}

+unsigned long __get_cmos_time(void)
+{
+ unsigned int year, mon, day, hour, min, sec;
+
+ /*
+ * Do we need the spinlock in here too?
+ *
+ * If we're called directly (not via get_cmos_time),
+ * we're in the middle of a sysdev suspend/resume
+ * and interrupts are disabled, so this
+ * should be safe without any locking.
+ * -- NC
+ */
+
+ do {
+ sec = CMOS_READ(RTC_SECONDS);
+ min = CMOS_READ(RTC_MINUTES);
+ hour = CMOS_READ(RTC_HOURS);
+ day = CMOS_READ(RTC_DAY_OF_MONTH);
+ mon = CMOS_READ(RTC_MONTH);
+ year = CMOS_READ(RTC_YEAR);
+ } while (sec != CMOS_READ(RTC_SECONDS));
+
+ /*
+ * We know that x86-64 always uses BCD format, no need to check the config
+ * register.
+ */
+
+ BCD_TO_BIN(sec);
+ BCD_TO_BIN(min);
+ BCD_TO_BIN(hour);
+ BCD_TO_BIN(day);
+ BCD_TO_BIN(mon);
+ BCD_TO_BIN(year);
+
+ /*
+ * This will work up to Dec 31, 2069.
+ */
+
+ if ((year += 1900) < 1970)
+ year += 100;
+
+ return mktime(year, mon, day, hour, min, sec);
+}
+
unsigned long get_cmos_time(void)
{
- unsigned int timeout, year, mon, day, hour, min, sec;
+ unsigned int timeout;
unsigned char last, this;
- unsigned long flags;
+ unsigned long flags, result;

/*
* The Linux interpretation of the CMOS clock register contents: When the
@@ -524,40 +569,11 @@ unsigned long get_cmos_time(void)
timeout--;
}

-/*
- * Here we are safe to assume the registers won't change for a whole second, so
- * we just go ahead and read them.
- */
-
- sec = CMOS_READ(RTC_SECONDS);
- min = CMOS_READ(RTC_MINUTES);
- hour = CMOS_READ(RTC_HOURS);
- day = CMOS_READ(RTC_DAY_OF_MONTH);
- mon = CMOS_READ(RTC_MONTH);
- year = CMOS_READ(RTC_YEAR);
-
+ result = __get_cmos_time();
spin_unlock_irqrestore(&rtc_lock, flags);

-/*
- * We know that x86-64 always uses BCD format, no need to check the config
- * register.
- */
-
- BCD_TO_BIN(sec);
- BCD_TO_BIN(min);
- BCD_TO_BIN(hour);
- BCD_TO_BIN(day);
- BCD_TO_BIN(mon);
- BCD_TO_BIN(year);
+ return result;

-/*
- * This will work up to Dec 31, 2069.
- */
-
- if ((year += 1900) < 1970)
- year += 100;
-
- return mktime(year, mon, day, hour, min, sec);
}

#ifdef CONFIG_CPU_FREQ
@@ -962,7 +978,7 @@ static int timer_suspend(struct sys_devi
/*
* Estimate time zone so that set_time can update the clock
*/
- clock_cmos_diff = -get_cmos_time();
+ clock_cmos_diff = -__get_cmos_time();
clock_cmos_diff += get_seconds();
return 0;
}
diff -ruNp 913-old/include/asm-i386/mach-default/mach_time.h 913-new/include/asm-i386/mach-default/mach_time.h
--- 913-old/include/asm-i386/mach-default/mach_time.h 2004-11-03 21:53:12.000000000 +1100
+++ 913-new/include/asm-i386/mach-default/mach_time.h 2005-01-08 19:39:24.676276496 +1100
@@ -79,24 +79,19 @@ static inline int mach_set_rtc_mmss(unsi
return retval;
}

-static inline unsigned long mach_get_cmos_time(void)
+/* __get_cmos_time
+ *
+ * Separated out from mach_get_cmos_time so that we can
+ * quickly get the cmos time when we don't care about
+ * whether the second has just started.
+ *
+ * Used from suspend and resume sysdev calls.
+ */
+static inline unsigned long __get_cmos_time(void)
{
unsigned int year, mon, day, hour, min, sec;
- int i;

- /* The Linux interpretation of the CMOS clock register contents:
- * When the Update-In-Progress (UIP) flag goes from 1 to 0, the
- * RTC registers show the second which has precisely just started.
- * Let's hope other operating systems interpret the RTC the same way.
- */
- /* read RTC exactly on falling edge of update flag */
- for (i = 0 ; i < 1000000 ; i++) /* may take up to 1 second... */
- if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP)
- break;
- for (i = 0 ; i < 1000000 ; i++) /* must try at least 2.228 ms */
- if (!(CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP))
- break;
- do { /* Isn't this overkill ? UIP above should guarantee consistency */
+ do {
sec = CMOS_READ(RTC_SECONDS);
min = CMOS_READ(RTC_MINUTES);
hour = CMOS_READ(RTC_HOURS);
@@ -104,6 +99,7 @@ static inline unsigned long mach_get_cmo
mon = CMOS_READ(RTC_MONTH);
year = CMOS_READ(RTC_YEAR);
} while (sec != CMOS_READ(RTC_SECONDS));
+
if (!(CMOS_READ(RTC_CONTROL) & RTC_DM_BINARY) || RTC_ALWAYS_BCD)
{
BCD_TO_BIN(sec);
@@ -119,4 +115,24 @@ static inline unsigned long mach_get_cmo
return mktime(year, mon, day, hour, min, sec);
}

+static inline unsigned long mach_get_cmos_time(void)
+{
+ int i;
+
+ /* The Linux interpretation of the CMOS clock register contents:
+ * When the Update-In-Progress (UIP) flag goes from 1 to 0, the
+ * RTC registers show the second which has precisely just started.
+ * Let's hope other operating systems interpret the RTC the same way.
+ */
+ /* read RTC exactly on falling edge of update flag */
+ for (i = 0 ; i < 1000000 ; i++) /* may take up to 1 second... */
+ if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP)
+ break;
+ for (i = 0 ; i < 1000000 ; i++) /* must try at least 2.228 ms */
+ if (!(CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP))
+ break;
+
+ return __get_cmos_time();
+}
+
#endif /* !_MACH_TIME_H */


2005-01-08 13:21:35

by Pavel Machek

[permalink] [raw]
Subject: Re: Patch 1/3: Reduce number of get_cmos_time_calls.

Hi!

> Reduce the number of calls to get_cmos_time. Since get_cmos_time waits
> for the start of a new second, two consecutive calls add one full second
> to the time to suspend/resume.

Ack. (You probably want to send it to akpm...?). I did not realize
that get_cmos_time() is so costly.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-01-08 13:23:52

by Pavel Machek

[permalink] [raw]
Subject: Re: Patch 2/3: Reduce number of get_cmos_time_calls.

Hi!

> Change sleep_start from signed to unsigned long. This appears to address
> an issue with the clock being occasionally off by around 1 hr 10
> minutes.

sleep_start is in seconds, right? I do not see how it could explain
1hr10 difference... Please do not apply this.

OTOH try this code (in userspace, run as root). It did strange thigs
to my time, and it is unrelated to swsusp. (But it does similar weird
things with interrupts).

pavel@amd:~$ cat misc/latency.c
void
main(void)
{
int i;
iopl(3);
while (1) {
asm volatile("cli");
// for (i=0; i<20000000; i++)
for (i=0; i<1000000000; i++)
asm volatile("");
asm volatile("sti");
sleep(1);
}
}
pavel@amd:~$

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-01-08 13:28:44

by Pavel Machek

[permalink] [raw]
Subject: Re: Patch 3/3: Reduce number of get_cmos_time_calls.

Hi!

> Create new __get_cmos_time patch, which doesn't wait for the start of a
> new second before returning. Adjust timer_suspend to use this as we
> don't appear to need the exact start of a second when suspending.

Basically nice cleanup. I do not know if this does not mean up-to
second error in clock for each suspend/resume?

> --- 913-old/arch/x86_64/kernel/time.c 2004-12-10 14:27:08.000000000 +1100
> +++ 913-new/arch/x86_64/kernel/time.c 2005-01-08 19:39:24.664278320 +1100
> @@ -499,11 +499,56 @@ unsigned long long sched_clock(void)
> return cycles_2_ns(a);
> }
>
> +unsigned long __get_cmos_time(void)
> +{

Missing static?

> +
> + /*
> + * Do we need the spinlock in here too?
> + *
> + * If we're called directly (not via get_cmos_time),
> + * we're in the middle of a sysdev suspend/resume
> + * and interrupts are disabled, so this
> + * should be safe without any locking.
> + * -- NC
> + */

I'd say "Caller is responsible for locking"... and explain this in
caller. Also do not sign comments.

> + do {
> + sec = CMOS_READ(RTC_SECONDS);
> + min = CMOS_READ(RTC_MINUTES);
> + hour = CMOS_READ(RTC_HOURS);
> + day = CMOS_READ(RTC_DAY_OF_MONTH);
> + mon = CMOS_READ(RTC_MONTH);
> + year = CMOS_READ(RTC_YEAR);
> + } while (sec != CMOS_READ(RTC_SECONDS));
> +
> + /*
> + * We know that x86-64 always uses BCD format, no need to check the config
> + * register.
> + */
> +
> + BCD_TO_BIN(sec);
> + BCD_TO_BIN(min);
> + BCD_TO_BIN(hour);
> + BCD_TO_BIN(day);
> + BCD_TO_BIN(mon);
> + BCD_TO_BIN(year);

Whitespace damage?
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!