2011-05-18 21:35:29

by Thomas Gleixner

[permalink] [raw]
Subject: [patch 2/7] clocksource: Get rid of the hardcoded 5 seconds sleep time limit

Slow clocksources can have a way longer sleep time than 5 seconds and
even fast ones can easily cope with 600 seconds and still maintain
proper accuracy.

Signed-off-by: Thomas Gleixner <[email protected]>
---
kernel/time/clocksource.c | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)

Index: linux-2.6-tip/kernel/time/clocksource.c
===================================================================
--- linux-2.6-tip.orig/kernel/time/clocksource.c
+++ linux-2.6-tip/kernel/time/clocksource.c
@@ -626,19 +626,6 @@ static void clocksource_enqueue(struct c
list_add(&cs->list, entry);
}

-
-/*
- * Maximum time we expect to go between ticks. This includes idle
- * tickless time. It provides the trade off between selecting a
- * mult/shift pair that is very precise but can only handle a short
- * period of time, vs. a mult/shift pair that can handle long periods
- * of time but isn't as precise.
- *
- * This is a subsystem constant, and actual hardware limitations
- * may override it (ie: clocksources that wrap every 3 seconds).
- */
-#define MAX_UPDATE_LENGTH 5 /* Seconds */
-
/**
* __clocksource_updatefreq_scale - Used update clocksource with new freq
* @t: clocksource to be registered
@@ -652,15 +639,28 @@ static void clocksource_enqueue(struct c
*/
void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
{
+ unsigned long sec;
+
/*
- * Ideally we want to use some of the limits used in
- * clocksource_max_deferment, to provide a more informed
- * MAX_UPDATE_LENGTH. But for now this just gets the
- * register interface working properly.
+ * Calc the maximum number of seconds which we can run before
+ * wrapping around. For clocksources which have a mask > 32bit
+ * we need to limit the max sleep time to have a good
+ * conversion precision. 10 minutes is still a reasonable
+ * amount. That results in a shift value of 24 for a
+ * clocksource with mask >= 40bit and f >= 4GHz. That maps to
+ * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
+ * margin as we do in clocksource_max_deferment()
*/
+ sec = (cs->mask - (cs->mask >> 5));
+ do_div(sec, freq);
+ do_div(sec, scale);
+ if (!sec)
+ sec = 1;
+ else if (sec > 600 && cs->mask > UINT_MAX)
+ sec = 600;
+
clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
- NSEC_PER_SEC/scale,
- MAX_UPDATE_LENGTH*scale);
+ NSEC_PER_SEC / scale, sec * scale);
cs->max_idle_ns = clocksource_max_deferment(cs);
}
EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);


2011-05-19 00:57:41

by John Stultz

[permalink] [raw]
Subject: Re: [patch 2/7] clocksource: Get rid of the hardcoded 5 seconds sleep time limit

On Wed, 2011-05-18 at 21:33 +0000, Thomas Gleixner wrote:
> plain text document attachment
> (clocksource-make-shift-mult-calc-more-clever.patch)
> Slow clocksources can have a way longer sleep time than 5 seconds and
> even fast ones can easily cope with 600 seconds and still maintain
> proper accuracy.
>
> Signed-off-by: Thomas Gleixner <[email protected]>
> ---
> kernel/time/clocksource.c | 38 +++++++++++++++++++-------------------
> 1 file changed, 19 insertions(+), 19 deletions(-)
>
> Index: linux-2.6-tip/kernel/time/clocksource.c
> ===================================================================
> --- linux-2.6-tip.orig/kernel/time/clocksource.c
> +++ linux-2.6-tip/kernel/time/clocksource.c
> @@ -626,19 +626,6 @@ static void clocksource_enqueue(struct c
> list_add(&cs->list, entry);
> }
>
> -
> -/*
> - * Maximum time we expect to go between ticks. This includes idle
> - * tickless time. It provides the trade off between selecting a
> - * mult/shift pair that is very precise but can only handle a short
> - * period of time, vs. a mult/shift pair that can handle long periods
> - * of time but isn't as precise.
> - *
> - * This is a subsystem constant, and actual hardware limitations
> - * may override it (ie: clocksources that wrap every 3 seconds).
> - */
> -#define MAX_UPDATE_LENGTH 5 /* Seconds */
> -
> /**
> * __clocksource_updatefreq_scale - Used update clocksource with new freq
> * @t: clocksource to be registered
> @@ -652,15 +639,28 @@ static void clocksource_enqueue(struct c
> */
> void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
> {
> + unsigned long sec;
> +
> /*
> - * Ideally we want to use some of the limits used in
> - * clocksource_max_deferment, to provide a more informed
> - * MAX_UPDATE_LENGTH. But for now this just gets the
> - * register interface working properly.
> + * Calc the maximum number of seconds which we can run before
> + * wrapping around. For clocksources which have a mask > 32bit
> + * we need to limit the max sleep time to have a good
> + * conversion precision. 10 minutes is still a reasonable
> + * amount. That results in a shift value of 24 for a
> + * clocksource with mask >= 40bit and f >= 4GHz. That maps to
> + * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
> + * margin as we do in clocksource_max_deferment()
> */

So, its not clear to me that the 12.5% margin really needed, since we
take it out when we calculate clocksource_max_deferment(). Although with
or without the margin I get the same mult/shift/max_idle_ns values for
the hardware I'm testing.

Another nice detail from the change: It doesn't affect clocksources that
normally wrap quickly:

Without your patch:
--------------
JDB: hpet mult: 2684354560 shift: 26 max_idle: 83214991360
JDB: acpi_pm mult: 2343484437 shift: 23 max_idle: 4540500826
JDB: tsc mult: 1342183991 shift: 31 max_idle: 2600481483

With your patch:
---------------
JDB: hpet mult: 2684354560 shift: 26 max_idle: 83214991360
JDB: acpi_pm mult: 2343484437 shift: 23 max_idle: 4540500826
JDB: tsc mult: 10485812 shift: 24 max_idle: 332861616128

Although interestingly 332 secs calculated for the max idle is more then
12% off of 600.

> + sec = (cs->mask - (cs->mask >> 5));
> + do_div(sec, freq);
> + do_div(sec, scale);
> + if (!sec)
> + sec = 1;
> + else if (sec > 600 && cs->mask > UINT_MAX)
> + sec = 600;
> +
> clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
> - NSEC_PER_SEC/scale,
> - MAX_UPDATE_LENGTH*scale);
> + NSEC_PER_SEC / scale, sec * scale);
> cs->max_idle_ns = clocksource_max_deferment(cs);
> }
> EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);

Overall it looks good. I'm doing some NTP testing with the TSC shift
change to make sure we don't get any odd side effects. I'll let you know
how those go.

thanks
-john


2011-05-19 08:43:27

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [patch 2/7] clocksource: Get rid of the hardcoded 5 seconds sleep time limit

On Wed, 18 May 2011, John Stultz wrote:
> On Wed, 2011-05-18 at 21:33 +0000, Thomas Gleixner wrote:
> > - * Ideally we want to use some of the limits used in
> > - * clocksource_max_deferment, to provide a more informed
> > - * MAX_UPDATE_LENGTH. But for now this just gets the
> > - * register interface working properly.
> > + * Calc the maximum number of seconds which we can run before
> > + * wrapping around. For clocksources which have a mask > 32bit
> > + * we need to limit the max sleep time to have a good
> > + * conversion precision. 10 minutes is still a reasonable
> > + * amount. That results in a shift value of 24 for a
> > + * clocksource with mask >= 40bit and f >= 4GHz. That maps to
> > + * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
> > + * margin as we do in clocksource_max_deferment()
> > */
>
> So, its not clear to me that the 12.5% margin really needed, since we
> take it out when we calculate clocksource_max_deferment(). Although with
> or without the margin I get the same mult/shift/max_idle_ns values for
> the hardware I'm testing.
>
> Another nice detail from the change: It doesn't affect clocksources that
> normally wrap quickly:
>
> Without your patch:
> --------------
> JDB: hpet mult: 2684354560 shift: 26 max_idle: 83214991360
> JDB: acpi_pm mult: 2343484437 shift: 23 max_idle: 4540500826
> JDB: tsc mult: 1342183991 shift: 31 max_idle: 2600481483
>
> With your patch:
> ---------------
> JDB: hpet mult: 2684354560 shift: 26 max_idle: 83214991360
> JDB: acpi_pm mult: 2343484437 shift: 23 max_idle: 4540500826
> JDB: tsc mult: 10485812 shift: 24 max_idle: 332861616128
>
> Although interestingly 332 secs calculated for the max idle is more then
> 12% off of 600.

We probably need to look at the math of clocksource_max_deferment()
again.

> Overall it looks good. I'm doing some NTP testing with the TSC shift
> change to make sure we don't get any odd side effects. I'll let you know
> how those go.

Ok.

Thanks,

tglx

2011-05-20 01:11:00

by john stultz

[permalink] [raw]
Subject: Re: [patch 2/7] clocksource: Get rid of the hardcoded 5 seconds sleep time limit

On Wed, 2011-05-18 at 17:57 -0700, John Stultz wrote:
> Overall it looks good. I'm doing some NTP testing with the TSC shift
> change to make sure we don't get any odd side effects. I'll let you know
> how those go.

So from overnight testing all looks well.

thanks
-john