2011-06-01 06:07:59

by John Stultz

Subject: [RFC][PATCH 0/2] Avoid accumulating drift in suspend/resume

Arve Hjønnevåg noted that in the suspend/resume path, we're likely to see
half-second errors from each read of the RTC. If a system is frequently
suspended, these errors will accumulate quickly.

Arve's solution was to measure the time delta between the system time
and the RTC at each suspend. If that delta has changed only slightly since
the previous suspend, reuse the previous delta. This consistency keeps the
error from accumulating.

This patch set implements Arve's suggestion for both the RTC and persistent
clock suspend paths.

Initial tests show that this improves time accuracy over many repeated
suspends. So while testing continues, I just wanted to send this out for
review and feedback.

thanks
-john

CC: Arve Hjønnevåg <[email protected]>
CC: Thomas Gleixner <[email protected]>

John Stultz (2):
time: Avoid accumulating time drift in suspend/resume
rtc: Avoid accumulating time drift in suspend/resume

drivers/rtc/class.c | 65 +++++++++++++++++++++++++++++++++------------
kernel/time/timekeeping.c | 22 +++++++++++++++
2 files changed, 70 insertions(+), 17 deletions(-)

--
1.7.3.2.146.gca209


2011-06-01 06:08:02

by John Stultz

Subject: [PATCH 1/2] time: Avoid accumulating time drift in suspend/resume

Because the read_persistent_clock interface is usually backed by
only a second granular interface, each time we read from the persistent
clock for suspend/resume, we introduce a half second (on average) of error.

In order to avoid this error accumulating as the system is suspended
over and over, this patch measures the time delta between the persistent
clock and the system CLOCK_REALTIME.

If the delta is less than 2 seconds from the last suspend, we compensate
by using the previous time delta (keeping it close). If it is larger
than 2 seconds, we assume the clock was set or has been changed, so we
do no correction and update the delta.

Note: If NTP is running, this could seem to "fight" with the NTP corrected
time: if the system time was off by 1 second, and NTP slewed the
value in, a suspend/resume cycle could undo this correction by trying to
restore the previous offset from the persistent clock. However, without
this patch, since each read could cause almost a full second worth of
error, it's possible to get almost 2 seconds of error just from the
suspend/resume cycle alone, so this is about equal to any offset added by
the compensation.

Further, on systems that suspend/resume frequently, this should keep time
closer than NTP could compensate for if the errors were allowed to
accumulate.
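
To illustrate the accumulation argument, here is a rough userspace
simulation, not kernel code. The names rtc_read_ms() and
simulate_error_ms(), the millisecond units, and the symmetric 2-second
check are assumptions made for this sketch; the patch itself works on
struct timespec and tests abs(delta_delta.tv_sec).

```c
#include <stdint.h>
#include <stdlib.h>

#define MS_PER_SEC 1000LL

/* Hypothetical second-granular clock: truncates true time to whole seconds. */
static int64_t rtc_read_ms(int64_t true_ms)
{
	return (true_ms / MS_PER_SEC) * MS_PER_SEC;
}

/*
 * Simulate `cycles` suspend/resume cycles and return the final system
 * clock error (system time minus true time) in milliseconds.
 * Each cycle runs for awake_ms, then sleeps for asleep_ms.
 * If `compensate` is nonzero, the suspend timestamp is adjusted so that
 * (system - suspend timestamp) stays equal to the remembered delta.
 */
int64_t simulate_error_ms(int cycles, int64_t awake_ms, int64_t asleep_ms,
			  int compensate)
{
	int64_t true_ms = 0, system_ms = 0, old_delta = 0;

	for (int i = 0; i < cycles; i++) {
		/* Awake: the system clock tracks true time. */
		true_ms += awake_ms;
		system_ms += awake_ms;

		/* Suspend: snapshot the second-granular clock. */
		int64_t suspend_ms = rtc_read_ms(true_ms);
		if (compensate) {
			int64_t delta = system_ms - suspend_ms;
			int64_t delta_delta = delta - old_delta;

			if (llabs(delta_delta) >= 2 * MS_PER_SEC)
				old_delta = delta;          /* clock was set: re-baseline */
			else
				suspend_ms += delta_delta;  /* keep the delta constant */
		}

		/* Asleep: only true time advances. */
		true_ms += asleep_ms;

		/* Resume: inject the measured sleep time into the system clock. */
		system_ms += rtc_read_ms(true_ms) - suspend_ms;
	}
	return system_ms - true_ms;
}
```

With a deliberately contrived schedule of 900 ms awake and 10100 ms asleep,
the uncompensated clock gains 900 ms on every cycle, while the compensated
clock ends each cycle with no added error.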

Credits to Arve Hjønnevåg for suggesting this solution.

CC: Arve Hjønnevåg <[email protected]>
CC: Thomas Gleixner <[email protected]>
Signed-off-by: John Stultz <[email protected]>
---
kernel/time/timekeeping.c | 22 ++++++++++++++++++++++
1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 342408c..7665fea 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -686,12 +686,34 @@ static void timekeeping_resume(void)
static int timekeeping_suspend(void)
{
unsigned long flags;
+ struct timespec delta, delta_delta;
+ static struct timespec old_delta;

read_persistent_clock(&timekeeping_suspend_time);

write_seqlock_irqsave(&xtime_lock, flags);
timekeeping_forward_now();
timekeeping_suspended = 1;
+
+ /*
+ * To avoid drift caused by repeated suspend/resumes,
+ * which each can add ~1 second drift error,
+ * try to compensate so the difference in system time
+ * and persistent_clock time stays close to constant.
+ */
+ delta = timespec_sub(xtime, timekeeping_suspend_time);
+ delta_delta = timespec_sub(delta, old_delta);
+ if (abs(delta_delta.tv_sec) >= 2) {
+ /*
+ * if delta_delta is too large, assume time correction
+ * has occurred and set old_delta to the current delta.
+ */
+ old_delta = delta;
+ } else {
+ /* Otherwise try to adjust timekeeping_suspend_time to compensate */
+ timekeeping_suspend_time =
+ timespec_add(timekeeping_suspend_time, delta_delta);
+ }
write_sequnlock_irqrestore(&xtime_lock, flags);

clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
--
1.7.3.2.146.gca209

2011-06-01 06:08:23

by John Stultz

Subject: [PATCH 2/2] rtc: Avoid accumulating time drift in suspend/resume

Because the RTC interface is only a second granular interface,
each time we read from the RTC for suspend/resume, we introduce a
half second (on average) of error.

In order to avoid this error accumulating as the system is suspended
over and over, this patch measures the time delta between the RTC
and the system CLOCK_REALTIME.

If the delta is less than 2 seconds from the last suspend, we compensate
by using the previous time delta (keeping it close). If it is larger
than 2 seconds, we assume the clock was set or has been changed, so we
do no correction and update the delta.
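
As a rough userspace sketch of that decision (not the kernel code itself):
snapshot_system_ns() and old_delta_ns are hypothetical names, times are
plain nanosecond counts rather than struct timespec, and the 2-second
check is applied symmetrically for simplicity.

```c
#include <stdint.h>
#include <stdlib.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for the patch's static old_delta, in nanoseconds. */
static int64_t old_delta_ns;

/*
 * Given the system time and the RTC reading taken at suspend, return the
 * system-time snapshot to remember for the matching resume.
 */
int64_t snapshot_system_ns(int64_t system_ns, int64_t rtc_ns)
{
	int64_t delta = system_ns - rtc_ns;
	int64_t delta_delta = delta - old_delta_ns;

	if (llabs(delta_delta) >= 2 * NSEC_PER_SEC) {
		/* Large change: assume the clock was set, so re-baseline. */
		old_delta_ns = delta;
		return system_ns;
	}

	/* Small change: shift the snapshot so (snapshot - rtc) == old delta. */
	return system_ns - delta_delta;
}
```

After a small change, the returned snapshot minus the RTC reading equals the
remembered delta again, which is what keeps the offset from wandering.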

Note: If NTP is running, this could seem to "fight" with the NTP corrected
time: if the system time was off by 1 second, and NTP slewed the
value in, a suspend/resume cycle could undo this correction by trying to
restore the previous offset from the RTC. However, without this patch,
since each read could cause almost a full second worth of error, it's
possible to get almost 2 seconds of error just from the suspend/resume
cycle alone, so this is about equal to any offset added by the compensation.

Further, on systems that suspend/resume frequently, this should keep time
closer than NTP could compensate for if the errors were allowed to
accumulate.

Credits to Arve Hjønnevåg for suggesting this solution.

This patch also improves some of the variable names and adds clearer
comments.

CC: Arve Hjønnevåg <[email protected]>
CC: Thomas Gleixner <[email protected]>
Signed-off-by: John Stultz <[email protected]>
---
drivers/rtc/class.c | 65 +++++++++++++++++++++++++++++++++++++-------------
1 files changed, 48 insertions(+), 17 deletions(-)

diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index 4194e59..a619228 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -41,20 +41,41 @@ static void rtc_device_release(struct device *dev)
* system's wall clock; restore it on resume().
*/

-static time_t oldtime;
-static struct timespec oldts;
+static struct timespec old_rtc, old_system, old_delta;
+

static int rtc_suspend(struct device *dev, pm_message_t mesg)
{
struct rtc_device *rtc = to_rtc_device(dev);
struct rtc_time tm;
-
+ struct timespec delta, delta_delta;
if (strcmp(dev_name(&rtc->dev), CONFIG_RTC_HCTOSYS_DEVICE) != 0)
return 0;

+ /* snapshot the current RTC and system time at suspend*/
rtc_read_time(rtc, &tm);
- ktime_get_ts(&oldts);
- rtc_tm_to_time(&tm, &oldtime);
+ getnstimeofday(&old_system);
+ rtc_tm_to_time(&tm, &old_rtc.tv_sec);
+
+
+ /*
+ * To avoid drift caused by repeated suspend/resumes,
+ * which each can add ~1 second drift error,
+ * try to compensate so the difference in system time
+ * and rtc time stays close to constant.
+ */
+ delta = timespec_sub(old_system, old_rtc);
+ delta_delta = timespec_sub(delta, old_delta);
+ if (abs(delta_delta.tv_sec) >= 2) {
+ /*
+ * if delta_delta is too large, assume time correction
+ * has occurred and set old_delta to the current delta.
+ */
+ old_delta = delta;
+ } else {
+ /* Otherwise try to adjust old_system to compensate */
+ old_system = timespec_sub(old_system, delta_delta);
+ }

return 0;
}
@@ -63,32 +84,42 @@ static int rtc_resume(struct device *dev)
{
struct rtc_device *rtc = to_rtc_device(dev);
struct rtc_time tm;
- time_t newtime;
- struct timespec time;
- struct timespec newts;
+ struct timespec new_system, new_rtc;
+ struct timespec sleep_time;

if (strcmp(dev_name(&rtc->dev), CONFIG_RTC_HCTOSYS_DEVICE) != 0)
return 0;

- ktime_get_ts(&newts);
+ /* snapshot the current rtc and system time at resume */
+ getnstimeofday(&new_system);
rtc_read_time(rtc, &tm);
if (rtc_valid_tm(&tm) != 0) {
pr_debug("%s: bogus resume time\n", dev_name(&rtc->dev));
return 0;
}
- rtc_tm_to_time(&tm, &newtime);
- if (newtime <= oldtime) {
- if (newtime < oldtime)
+ rtc_tm_to_time(&tm, &new_rtc.tv_sec);
+ new_rtc.tv_nsec = 0;
+
+ if (new_rtc.tv_sec <= old_rtc.tv_sec) {
+ if (new_rtc.tv_sec < old_rtc.tv_sec)
pr_debug("%s: time travel!\n", dev_name(&rtc->dev));
return 0;
}
- /* calculate the RTC time delta */
- set_normalized_timespec(&time, newtime - oldtime, 0);

- /* subtract kernel time between rtc_suspend to rtc_resume */
- time = timespec_sub(time, timespec_sub(newts, oldts));
+ /* calculate the RTC time delta (sleep time)*/
+ sleep_time = timespec_sub(new_rtc, old_rtc);
+
+ /*
+ * Since these RTC suspend/resume handlers are not called
+ * at the very end of suspend or the start of resume,
+ * some run-time may pass on either sides of the sleep time
+ * so subtract kernel run-time between rtc_suspend to rtc_resume
+ * to keep things accurate.
+ */
+ sleep_time = timespec_sub(sleep_time,
+ timespec_sub(new_system, old_system));

- timekeeping_inject_sleeptime(&time);
+ timekeeping_inject_sleeptime(&sleep_time);
return 0;
}

--
1.7.3.2.146.gca209

2011-06-02 00:54:48

by Arve Hjønnevåg

Subject: Re: [PATCH 2/2] rtc: Avoid accumulating time drift in suspend/resume

On Tue, May 31, 2011 at 11:07 PM, John Stultz <[email protected]> wrote:
> Because the RTC interface is only a second granular interface,
> each time we read from the RTC for suspend/resume, we introduce a
> half second (on average) of error.
>
> In order to avoid this error accumulating as the system is suspended
> over and over, this patch measures the time delta between the RTC
> and the system CLOCK_REALTIME.
>
> If the delta is less than 2 seconds from the last suspend, we compensate
> by using the previous time delta (keeping it close). If it is larger
> than 2 seconds, we assume the clock was set or has been changed, so we
> do no correction and update the delta.
>
> Note: If NTP is running, this could seem to "fight" with the NTP corrected
> time: if the system time was off by 1 second, and NTP slewed the
> value in, a suspend/resume cycle could undo this correction by trying to
> restore the previous offset from the RTC. However, without this patch,
> since each read could cause almost a full second worth of error, it's
> possible to get almost 2 seconds of error just from the suspend/resume
> cycle alone, so this is about equal to any offset added by the compensation.
>
> Further, on systems that suspend/resume frequently, this should keep time
> closer than NTP could compensate for if the errors were allowed to
> accumulate.
>
> Credits to Arve Hjønnevåg for suggesting this solution.
>
> This patch also improves some of the variable names and adds more clear
> comments.
>
> CC: Arve Hjønnevåg <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> Signed-off-by: John Stultz <[email protected]>
> ---
>  drivers/rtc/class.c |   65 +++++++++++++++++++++++++++++++++++++-------------
>  1 files changed, 48 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
> index 4194e59..a619228 100644
> --- a/drivers/rtc/class.c
> +++ b/drivers/rtc/class.c
> @@ -41,20 +41,41 @@ static void rtc_device_release(struct device *dev)
>  * system's wall clock; restore it on resume().
>  */
>
> -static time_t          oldtime;
> -static struct timespec oldts;
> +static struct timespec old_rtc, old_system, old_delta;
> +
>
>  static int rtc_suspend(struct device *dev, pm_message_t mesg)
>  {
>         struct rtc_device       *rtc = to_rtc_device(dev);
>         struct rtc_time         tm;
> -
> +       struct timespec         delta, delta_delta;
>         if (strcmp(dev_name(&rtc->dev), CONFIG_RTC_HCTOSYS_DEVICE) != 0)
>                 return 0;
>
> +       /* snapshot the current RTC and system time at suspend*/
>         rtc_read_time(rtc, &tm);
> -       ktime_get_ts(&oldts);
> -       rtc_tm_to_time(&tm, &oldtime);
> +       getnstimeofday(&old_system);
> +       rtc_tm_to_time(&tm, &old_rtc.tv_sec);
> +
> +
> +       /*
> +        * To avoid drift caused by repeated suspend/resumes,
> +        * which each can add ~1 second drift error,
> +        * try to compensate so the difference in system time
> +        * and rtc time stays close to constant.
> +        */
> +       delta = timespec_sub(old_system, old_rtc);
> +       delta_delta = timespec_sub(delta, old_delta);
> +       if (abs(delta_delta.tv_sec)  >= 2) {
> +               /*
> +                * if delta_delta is too large, assume time correction
> +                * has occurred and set old_delta to the current delta.
> +                */
> +               old_delta = delta;
> +       } else {
> +               /* Otherwise try to adjust old_system to compensate */
> +               old_system = timespec_sub(old_system, delta_delta);
> +       }
>
>         return 0;
>  }
> @@ -63,32 +84,42 @@ static int rtc_resume(struct device *dev)
>  {
>         struct rtc_device       *rtc = to_rtc_device(dev);
>         struct rtc_time         tm;
> -       time_t                  newtime;
> -       struct timespec         time;
> -       struct timespec         newts;
> +       struct timespec         new_system, new_rtc;
> +       struct timespec         sleep_time;
>
>         if (strcmp(dev_name(&rtc->dev), CONFIG_RTC_HCTOSYS_DEVICE) != 0)
>                 return 0;
>
> -       ktime_get_ts(&newts);
> +       /* snapshot the current rtc and system time at resume */
> +       getnstimeofday(&new_system);
>         rtc_read_time(rtc, &tm);
>         if (rtc_valid_tm(&tm) != 0) {
>                 pr_debug("%s:  bogus resume time\n", dev_name(&rtc->dev));
>                 return 0;
>         }
> -       rtc_tm_to_time(&tm, &newtime);
> -       if (newtime <= oldtime) {
> -               if (newtime < oldtime)
> +       rtc_tm_to_time(&tm, &new_rtc.tv_sec);
> +       new_rtc.tv_nsec = 0;
> +
> +       if (new_rtc.tv_sec <= old_rtc.tv_sec) {
> +               if (new_rtc.tv_sec < old_rtc.tv_sec)
>                         pr_debug("%s:  time travel!\n", dev_name(&rtc->dev));
>                 return 0;
>         }
> -       /* calculate the RTC time delta */
> -       set_normalized_timespec(&time, newtime - oldtime, 0);
>
> -       /* subtract kernel time between rtc_suspend to rtc_resume */
> -       time = timespec_sub(time, timespec_sub(newts, oldts));
> +       /* calculate the RTC time delta (sleep time)*/
> +       sleep_time = timespec_sub(new_rtc, old_rtc);
> +
> +       /*
> +        * Since these RTC suspend/resume handlers are not called
> +        * at the very end of suspend or the start of resume,
> +        * some run-time may pass on either sides of the sleep time
> +        * so subtract kernel run-time between rtc_suspend to rtc_resume
> +        * to keep things accurate.
> +        */
> +       sleep_time = timespec_sub(sleep_time,
> +                       timespec_sub(new_system, old_system));

What happens if sleep_time is negative? I think this needs to be
clamped to 0 to avoid backwards jumps when you wake up more than once
without the rtc advancing.
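
A minimal userspace sketch of such a clamp, for illustration only:
clamp_sleep_time() is a hypothetical helper, not an existing kernel
function, and it uses the userspace struct timespec from <time.h>.

```c
#include <time.h>

/*
 * Hypothetical helper (not a kernel API): treat a negative sleep interval
 * as zero so that injecting it can never move the clock backwards.
 */
struct timespec clamp_sleep_time(struct timespec sleep_time)
{
	/*
	 * A normalized negative timespec has tv_sec < 0. The tv_nsec test
	 * only matters if the caller passes an un-normalized value.
	 */
	if (sleep_time.tv_sec < 0 ||
	    (sleep_time.tv_sec == 0 && sleep_time.tv_nsec < 0)) {
		sleep_time.tv_sec = 0;
		sleep_time.tv_nsec = 0;
	}
	return sleep_time;
}
```

The same check could sit either in rtc_resume() before the value is
injected or inside the function that consumes it.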

>
> -       timekeeping_inject_sleeptime(&time);
> +       timekeeping_inject_sleeptime(&sleep_time);
>         return 0;
>  }
>
> --
> 1.7.3.2.146.gca209
>
>



--
Arve Hjønnevåg

2011-06-02 01:14:32

by John Stultz

Subject: Re: [PATCH 2/2] rtc: Avoid accumulating time drift in suspend/resume

On Wed, 2011-06-01 at 17:54 -0700, Arve Hjønnevåg wrote:
> On Tue, May 31, 2011 at 11:07 PM, John Stultz <[email protected]> wrote:
> > + /*
> > + * Since these RTC suspend/resume handlers are not called
> > + * at the very end of suspend or the start of resume,
> > + * some run-time may pass on either sides of the sleep time
> > + * so subtract kernel run-time between rtc_suspend to rtc_resume
> > + * to keep things accurate.
> > + */
> > + sleep_time = timespec_sub(sleep_time,
> > + timespec_sub(new_system, old_system));
>
> What happens if sleep_time is negative? I think this needs to be
> clamped to 0 to avoid backwards jumps when you wake up more than once
> without the rtc advancing.

Good thought! Although that will be easier to catch in
timekeeping_inject_sleeptime(), so I might add it there.

Thanks for the review!
-john