2008-06-25 12:04:33

by Mayank Sharma

Subject: Time drifting after multiple sleep/wakeup in timekeeping

Hi,

I noticed a bug with respect to time drifting after multiple sleep/wakeup sequences. We have an embedded ARM11-based platform to which we have successfully ported Linux. We also have an RTC on board, so we have implemented the read_persistent_clock() function, overriding the one defined in kernel/time/timekeeping.c. What we observed was that after multiple sleep/wakeup sequences, the times reported by the RTC and by gettimeofday drifted apart. After about 10 iterations, gettimeofday was lagging by about one second. After that, the lag only increased.

It looks to me as though the timekeeping_resume function adds the number of seconds we have been sleeping to adjust the new time. But since we add only the whole seconds slept, the update is only accurate to the second. read_persistent_clock has one-second granularity, so we cannot help that. Hence after one sleep/wake sequence, gettimeofday will have lagged by delta (where delta is less than a second). Over multiple such iterations the deltas add up to a second, and thereafter we see a drift of more than a second.
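
To make the accumulation concrete, here is a minimal C sketch of the resume logic described above. It is a model, not kernel code: it assumes an ideal RTC that reports only whole seconds and a system clock that runs perfectly while awake, and the names simulate_drift_ns and rtc_whole_seconds are made up for illustration. The sign and size of each step depend on where within a second the suspend and the resume happen to fall; the values in the note below were picked so that the clock lags.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical ideal RTC: reports the true time truncated to whole seconds. */
static int64_t rtc_whole_seconds(int64_t true_ns)
{
	return true_ns / NSEC_PER_SEC;
}

/*
 * Run `cycles` awake/asleep rounds and return (system clock - true time)
 * in nanoseconds after the last resume. Negative means the system lags.
 */
int64_t simulate_drift_ns(int cycles, int64_t awake_ns, int64_t asleep_ns)
{
	int64_t true_ns = 0;	/* reference time */
	int64_t sys_ns = 0;	/* what gettimeofday would report */
	int i;

	assert(cycles >= 0 && awake_ns >= 0 && asleep_ns >= 0);

	for (i = 0; i < cycles; i++) {
		int64_t suspend_sec, now_sec;

		/* Awake: both clocks advance together. */
		true_ns += awake_ns;
		sys_ns += awake_ns;

		/* Suspend: remember the RTC reading, system clock stops. */
		suspend_sec = rtc_whole_seconds(true_ns);
		true_ns += asleep_ns;

		/* Resume: add only the whole seconds the RTC saw. */
		now_sec = rtc_whole_seconds(true_ns);
		if (now_sec > suspend_sec)
			sys_ns += (now_sec - suspend_sec) * NSEC_PER_SEC;
	}
	return sys_ns - true_ns;
}
```

With 10.2 s awake and 4.7 s asleep per cycle, this model reports a lag of 0.7 s after one cycle and 2.1 s after three, because nothing ever corrects the sub-second remainder left by each resume.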

If, however, we set the gettimeofday time (xtime) to the RTC time on wakeup (just as we do in timekeeping_init()) instead of just adding the sleep time, the drift will not accumulate. I am using the patch at the end of this mail to fix the issue. Let me know if this is a valid patch.
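
For comparison, a sketch of this alternative in the same simplified model: on resume the system clock is set to the RTC reading rather than advanced by the whole seconds slept. It again assumes an ideal whole-second RTC that is itself correct, and simulate_resync_ns and rtc_seconds_now are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical ideal RTC: reports the true time truncated to whole seconds. */
static int64_t rtc_seconds_now(int64_t true_ns)
{
	return true_ns / NSEC_PER_SEC;
}

/*
 * Run `cycles` awake/asleep rounds, setting the system clock to the RTC
 * value on each resume. Returns (system clock - true time) in nanoseconds
 * after the last resume.
 */
int64_t simulate_resync_ns(int cycles, int64_t awake_ns, int64_t asleep_ns)
{
	int64_t true_ns = 0;
	int64_t sys_ns = 0;
	int i;

	assert(cycles >= 0 && awake_ns >= 0 && asleep_ns >= 0);

	for (i = 0; i < cycles; i++) {
		int64_t suspend_sec, now_sec;

		/* Awake: both clocks advance together. */
		true_ns += awake_ns;
		sys_ns += awake_ns;

		/* Suspend: remember the RTC reading, system clock stops. */
		suspend_sec = rtc_seconds_now(true_ns);
		true_ns += asleep_ns;

		/* Resume: overwrite the system clock with the RTC reading. */
		now_sec = rtc_seconds_now(true_ns);
		if (now_sec > suspend_sec)
			sys_ns = now_sec * NSEC_PER_SEC;
	}
	return sys_ns - true_ns;
}
```

In this model the error after any resume is just the sub-second part that the RTC cannot report, so it stays below one second however many cycles have run. The model says nothing about what happens when the RTC itself is inaccurate.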

Regards,
Mayank

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e91c29f..6edf37f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -288,12 +288,19 @@ static int timekeeping_resume(struct sys_device *dev)
 	if (now && (now > timekeeping_suspend_time)) {
 		unsigned long sleep_length = now - timekeeping_suspend_time;

-		xtime.tv_sec += sleep_length;
+		/* Synchronize the xtime with the rtc as is done during init. This
+		 * ensures that drift is not accumulated while sleeping and waking
+		 * multiple times
+		 */
+		xtime.tv_sec = now;
+		xtime.tv_nsec = 0;
 		wall_to_monotonic.tv_sec -= sleep_length;
 		total_sleep_time += sleep_length;
 	}
 	/* Make sure that we have the correct xtime reference */
-	timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
+	else {
+		timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
+	}
 	update_xtime_cache(0);
 	/* re-base the last cycle value */
 	clock->cycle_last = 0;


2008-06-26 06:47:46

by Bart Van Assche

Subject: Re: Time drifting after multiple sleep/wakeup in timekeeping

On Wed, Jun 25, 2008 at 1:48 PM, Mayank Sharma <[email protected]> wrote:
> I noticed a bug with respect to time drifting after multiple sleep/wakeup sequences.

I suggest that you CC at least one ARM-specific mailing list, and also
that you specify which kernel version the original behavior was
observed on. Were you working with a vanilla Linux kernel or a patched
one?

Bart.

2008-06-26 08:00:18

by Mayank Sharma

Subject: RE: Time drifting after multiple sleep/wakeup in timekeeping

Hi Bart,

I have observed this behaviour on 2.6.23-17. The diff in my earlier mail was against the latest kernel.

I am cc'ing linux-arm on this mail. In my opinion the problem is not restricted to ARM, and hence I posted this message to the linux-kernel list.

-Mayank

2008-06-27 11:06:18

by Bart Van Assche

Subject: Re: Time drifting after multiple sleep/wakeup in timekeeping

On Thu, Jun 26, 2008 at 10:00 AM, Mayank Sharma <[email protected]> wrote:
> I have observed this behaviour on 2.6.23-17. The diff in my earlier mail was against the latest kernel.
>
> I am cc'ing linux-arm on this mail. In my opinion the problem is not restricted to ARM, and hence I posted this message to the linux-kernel list.

Have you already been able to verify this?

Bart.

2008-06-27 11:35:46

by Mayank Sharma

Subject: RE: Time drifting after multiple sleep/wakeup in timekeeping

You mean on non-ARM platforms? No, I have not verified it.

-Mayank

2008-06-27 12:21:01

by Pavel Machek

Subject: Re: Time drifting after multiple sleep/wakeup in timekeeping

On Wed 2008-06-25 17:18:52, Mayank Sharma wrote:
> Hi,
>
> I noticed a bug with respect to time drifting after multiple sleep/wakeup sequences. We have an embedded ARM11-based platform to which we have successfully ported Linux. We also have an RTC on board, so we have implemented the read_persistent_clock() function, overriding the one defined in kernel/time/timekeeping.c. What we observed was that after multiple sleep/wakeup sequences, the times reported by the RTC and by gettimeofday drifted apart. After about 10 iterations, gettimeofday was lagging by about one second. After that, the lag only increased.
>

Hmm, should I look forward to a Linux-based GPS?

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-06-27 12:20:11

by Pavel Machek

Subject: Re: Time drifting after multiple sleep/wakeup in timekeeping


On Wed 2008-06-25 17:18:52, Mayank Sharma wrote:
> Hi,
>
> I noticed a bug with respect to time drifting after multiple sleep/wakeup sequences. We have an embedded ARM11-based platform to which we have successfully ported Linux. We also have an RTC on board, so we have implemented the read_persistent_clock() function, overriding the one defined in kernel/time/timekeeping.c. What we observed was that after multiple sleep/wakeup sequences, the times reported by the RTC and by gettimeofday drifted apart. After about 10 iterations, gettimeofday was lagging by about one second. After that, the lag only increased.
>
> It looks to me as though the timekeeping_resume function adds the number of seconds we have been sleeping to adjust the new time. But since we add only the whole seconds slept, the update is only accurate to the second. read_persistent_clock has one-second granularity, so we cannot help that. Hence after one sleep/wake sequence, gettimeofday will have lagged by delta (where delta is less than a second). Over multiple such iterations the deltas add up to a second, and thereafter we see a drift of more than a second.
>
> If, however, we set the gettimeofday time (xtime) to the RTC time on wakeup (just as we do in timekeeping_init()) instead of just adding the sleep time, the drift will not accumulate. I am using the patch at the end of this mail to fix the issue. Let me know if this is a valid patch.
>
> Regards,
> Mayank
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index e91c29f..6edf37f 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -288,12 +288,19 @@ static int timekeeping_resume(struct sys_device *dev)
>  	if (now && (now > timekeeping_suspend_time)) {
>  		unsigned long sleep_length = now - timekeeping_suspend_time;
>
> -		xtime.tv_sec += sleep_length;
> +		/* Synchronize the xtime with the rtc as is done during init. This
> +		 * ensures that drift is not accumulated while sleeping and waking
> +		 * multiple times
> +		 */
> +		xtime.tv_sec = now;
> +		xtime.tv_nsec = 0;

Is it possible that this removes the offset between the RTC and the system clock?

Added Rafael to Cc; I guess you should add the time maintainers (tglx?)
too...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-07-10 22:09:36

by john stultz

Subject: Re: Time drifting after multiple sleep/wakeup in timekeeping

On Wed, Jun 25, 2008 at 4:48 AM, Mayank Sharma <[email protected]> wrote:
> I noticed a bug with respect to time drifting after multiple sleep/wakeup sequences. We have an embedded ARM11-based platform to which we have successfully ported Linux. We also have an RTC on board, so we have implemented the read_persistent_clock() function, overriding the one defined in kernel/time/timekeeping.c. What we observed was that after multiple sleep/wakeup sequences, the times reported by the RTC and by gettimeofday drifted apart. After about 10 iterations, gettimeofday was lagging by about one second. After that, the lag only increased.
>
> It looks to me as though the timekeeping_resume function adds the number of seconds we have been sleeping to adjust the new time. But since we add only the whole seconds slept, the update is only accurate to the second. read_persistent_clock has one-second granularity, so we cannot help that. Hence after one sleep/wake sequence, gettimeofday will have lagged by delta (where delta is less than a second). Over multiple such iterations the deltas add up to a second, and thereafter we see a drift of more than a second.
>
> If, however, we set the gettimeofday time (xtime) to the RTC time on wakeup (just as we do in timekeeping_init()) instead of just adding the sleep time, the drift will not accumulate. I am using the patch at the end of this mail to fix the issue. Let me know if this is a valid patch.
>
> Regards,
> Mayank
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index e91c29f..6edf37f 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -288,12 +288,19 @@ static int timekeeping_resume(struct sys_device *dev)
>  	if (now && (now > timekeeping_suspend_time)) {
>  		unsigned long sleep_length = now - timekeeping_suspend_time;
>
> -		xtime.tv_sec += sleep_length;
> +		/* Synchronize the xtime with the rtc as is done during init. This
> +		 * ensures that drift is not accumulated while sleeping and waking
> +		 * multiple times
> +		 */
> +		xtime.tv_sec = now;
> +		xtime.tv_nsec = 0;

This would only be better if we are sure the persistent clock is NTP
synced (which it may not be) and it also waits for a second boundary
to return. On x86 I know the stall-for-a-second-boundary trick was
removed because it would add an extra 1sec delay to the suspend/resume
time.

Additionally, mixing the above with the below could cause the monotonic
clock to see inconsistencies.
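
One way to see the kind of inconsistency meant here is the small C model below. It is only a sketch: it treats the monotonic clock as xtime + wall_to_monotonic, uses a simplified struct ts in place of the kernel's struct timespec, and the function name monotonic_step_with_patch_ns is invented for the example. Under those assumptions, overwriting xtime with the RTC value while still subtracting sleep_length from wall_to_monotonic makes the monotonic clock step by (RTC at suspend - xtime at suspend), which is negative whenever xtime was ahead of the RTC, even by a fraction of a second.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Simplified stand-in for struct timespec (illustration only). */
struct ts {
	int64_t sec;
	int64_t nsec;
};

static int64_t ts_to_ns(struct ts t)
{
	return t.sec * NSEC_PER_SEC + t.nsec;
}

/*
 * Apply the patched resume logic to copies of xtime and wall_to_monotonic
 * and return (monotonic after resume - monotonic at suspend) in ns.
 * A negative result means the monotonic clock went backwards.
 */
int64_t monotonic_step_with_patch_ns(struct ts xtime,
				     struct ts wall_to_monotonic,
				     int64_t rtc_at_suspend,
				     int64_t rtc_at_resume)
{
	int64_t before, after, sleep_length;

	assert(rtc_at_resume > rtc_at_suspend);

	before = ts_to_ns(xtime) + ts_to_ns(wall_to_monotonic);

	sleep_length = rtc_at_resume - rtc_at_suspend;
	xtime.sec = rtc_at_resume;		/* patched: snap to the RTC */
	xtime.nsec = 0;
	wall_to_monotonic.sec -= sleep_length;	/* unchanged from original */

	after = ts_to_ns(xtime) + ts_to_ns(wall_to_monotonic);
	return after - before;
}
```

For example, with xtime at 100.9 s, the RTC reading 100 at suspend and 105 at resume, this model has the monotonic clock step back by 0.9 s across the suspend.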

>  		wall_to_monotonic.tv_sec -= sleep_length;
>  		total_sleep_time += sleep_length;
>  	}
>  	/* Make sure that we have the correct xtime reference */
> -	timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
> +	else {
> +		timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
> +	}
>  	update_xtime_cache(0);
>  	/* re-base the last cycle value */
>  	clock->cycle_last = 0;

So instead, I'd suggest extending the persistent_clock interface to
support/return nanoseconds, so the delta can be more precise. This
won't work on all hardware (since not all systems have nanosecond
resolution RTCs) but avoids any delays trying to only return on second
boundaries, etc.
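
As a rough illustration of that idea, the userspace C sketch below computes the sleep interval from two persistent-clock readings that carry nanoseconds and adds it to a wall-clock value. It uses struct timespec from <time.h> and is not the kernel interface; persistent_clock_delta and timespec_add_delta are hypothetical helper names, and the readings are assumed to come from a clock that actually has sub-second resolution.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Return end - start, normalized so that 0 <= tv_nsec < NSEC_PER_SEC.
 * Assumes end is not earlier than start.
 */
struct timespec persistent_clock_delta(struct timespec start,
				       struct timespec end)
{
	struct timespec d;

	assert(end.tv_sec > start.tv_sec ||
	       (end.tv_sec == start.tv_sec && end.tv_nsec >= start.tv_nsec));

	d.tv_sec = end.tv_sec - start.tv_sec;
	d.tv_nsec = end.tv_nsec - start.tv_nsec;
	if (d.tv_nsec < 0) {
		/* Borrow one second to keep tv_nsec non-negative. */
		d.tv_sec -= 1;
		d.tv_nsec += NSEC_PER_SEC;
	}
	return d;
}

/* Return t + d, carrying any nanosecond overflow into tv_sec. */
struct timespec timespec_add_delta(struct timespec t, struct timespec d)
{
	t.tv_sec += d.tv_sec;
	t.tv_nsec += d.tv_nsec;
	if (t.tv_nsec >= NSEC_PER_SEC) {
		t.tv_sec += 1;
		t.tv_nsec -= NSEC_PER_SEC;
	}
	return t;
}
```

With readings of 100.9 s at suspend and 105.1 s at resume, the delta comes out as 4.2 s rather than the 5 s a whole-second subtraction would give, so the sub-second remainder is carried along and does not build up across cycles.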

thanks
-john