LinuxLists.cc - [PATCH 01/13] hrtimer: round up relative start time

2006-02-13 01:09:34

Subject: [PATCH 01/13] hrtimer: round up relative start time

When starting a relative timer we have to round it up the next clock
tick to avoid an early expiry. The problem is that we don't know the
real clock resolution, so we have to assume the worst case, but it's
basically the same as the old code did, so it won't be worse than 2.6.15
and with a better clock interface we can improve this.

Signed-off-by: Roman Zippel <[email protected]>

---

kernel/hrtimer.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6-git/kernel/hrtimer.c
===================================================================
--- linux-2.6-git.orig/kernel/hrtimer.c 2006-02-12 18:32:48.000000000 +0100
+++ linux-2.6-git/kernel/hrtimer.c 2006-02-12 18:32:57.000000000 +0100
@@ -419,7 +419,8 @@ hrtimer_start(struct hrtimer *timer, kti
new_base = switch_hrtimer_base(timer, base);

if (mode == HRTIMER_REL)
- tim = ktime_add(tim, new_base->get_time());
+ tim = ktime_add(ktime_add(tim, new_base->get_time()),
+ base->resolution);
timer->expires = tim;

enqueue_hrtimer(timer, new_base);

2006-02-13 10:51:54

by Thomas Gleixner

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

On Mon, 2006-02-13 at 02:09 +0100, Roman Zippel wrote:
> When starting a relative timer we have to round it up the next clock
> tick to avoid an early expiry. The problem is that we don't know the
> real clock resolution, so we have to assume the worst case, but it's
> basically the same as the old code did, so it won't be worse than 2.6.15
> and with a better clock interface we can improve this.
>
> Signed-off-by: Roman Zippel <[email protected]>

NACK

This adds an artificial offset to the expiry time, for what reason? The
expiry code makes sure that timers can not expire early. See:

timer = rb_entry(node, struct hrtimer, node);
if (now.tv64 <= timer->expires.tv64)
break;

in kernel/hrtimers.c:run_hrtimer_queue(), where now is already tick
aligned.

Please provide a testcase (or detailed use-case) which proves that this
is necessary.

tglx

> ---
>
> kernel/hrtimer.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> Index: linux-2.6-git/kernel/hrtimer.c
> ===================================================================
> --- linux-2.6-git.orig/kernel/hrtimer.c 2006-02-12 18:32:48.000000000 +0100
> +++ linux-2.6-git/kernel/hrtimer.c 2006-02-12 18:32:57.000000000 +0100
> @@ -419,7 +419,8 @@ hrtimer_start(struct hrtimer *timer, kti
> new_base = switch_hrtimer_base(timer, base);
>
> if (mode == HRTIMER_REL)
> - tim = ktime_add(tim, new_base->get_time());
> + tim = ktime_add(ktime_add(tim, new_base->get_time()),
> + base->resolution);
> timer->expires = tim;
>
> enqueue_hrtimer(timer, new_base);

2006-02-13 11:25:46

by Roman Zippel

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

Hi,

On Mon, 13 Feb 2006, Thomas Gleixner wrote:

> This adds an artificial offset to the expiry time, for what reason? The
> expiry code makes sure that timers can not expire early. See:
>
> timer = rb_entry(node, struct hrtimer, node);
> if (now.tv64 <= timer->expires.tv64)
> break;
>
> in kernel/hrtimers.c:run_hrtimer_queue(), where now is already tick
> aligned.
>
> Please provide a testcase (or detailed use-case) which proves that this
> is necessary.

Let's assume a get_time() which simply returns xtime and so has a
resolution of around TICK_NSEC. This means the real time when one calls
get_time() is somewhere between xtime and xtime+TICK_NSEC. Assuming the
real time is xtime+TICK_NSEC-1, get_time() will return xtime and a
relative timer with TICK_NSEC-1 will expire immediately.
The old code did this correctly. For most hardware this is not a real
issue, as the delivery time is larger than the clock resolution, but
unless you can guarantee it's not an issue on _any_ supported hardware,
this fix is needed. As I already said this can be better fixed as soon as
we have a better clock abstraction, until then this is only restores the
old behaviour.

bye, Roman

2006-02-13 13:03:42

by Ingo Molnar

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

* Roman Zippel <[email protected]> wrote:

> > This adds an artificial offset to the expiry time, for what reason? The
> > expiry code makes sure that timers can not expire early. See:
> >
> > timer = rb_entry(node, struct hrtimer, node);
> > if (now.tv64 <= timer->expires.tv64)
> > break;
> >
> > in kernel/hrtimers.c:run_hrtimer_queue(), where now is already tick
> > aligned.
> >
> > Please provide a testcase (or detailed use-case) which proves that this
> > is necessary.
>
> Let's assume a get_time() which simply returns xtime and so has a
> resolution of around TICK_NSEC. This means the real time when one
> calls get_time() is somewhere between xtime and xtime+TICK_NSEC.
> Assuming the real time is xtime+TICK_NSEC-1, get_time() will return
> xtime and a relative timer with TICK_NSEC-1 will expire immediately.
> The old code did this correctly. For most hardware this is not a real
> issue, as the delivery time is larger than the clock resolution, but
> unless you can guarantee it's not an issue on _any_ supported
> hardware, this fix is needed. As I already said this can be better
> fixed as soon as we have a better clock abstraction, until then this
> is only restores the old behaviour.

but there is no 'old behavior' to restore to. The +1 to itimer intervals
caused artifacts that were hitting users and caused 2.4 -> 2.6 itimer
regressions, which hrtimers fixed. E.g.:

http://bugzilla.kernel.org/show_bug.cgi?id=3289

so i dont think restoring the first timeout of an interval timer to be
increased by resolution [which your patch does] has any meaning. It
'restores' to half of what 2.6 did prior hrtimers. Doing that would be
inconsistent and would push the 'sum-up' errors observed for interval
timers above to be again observable in user-space (if user-space does a
series of timeouts). What's the point?

Ingo

2006-02-13 13:43:01

by Roman Zippel

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

Hi,

On Mon, 13 Feb 2006, Ingo Molnar wrote:

> but there is no 'old behavior' to restore to. The +1 to itimer intervals
> caused artifacts that were hitting users and caused 2.4 -> 2.6 itimer
> regressions, which hrtimers fixed. E.g.:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=3289

Ingo, please read correctly, this is mainly about interval timers, which
my patch doesn't change. My patch only fixes the initial start time.

bye, Roman

2006-02-13 14:46:04

by Ingo Molnar

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

* Roman Zippel <[email protected]> wrote:

> Hi,
>
> On Mon, 13 Feb 2006, Ingo Molnar wrote:
>
> > but there is no 'old behavior' to restore to. The +1 to itimer intervals
> > caused artifacts that were hitting users and caused 2.4 -> 2.6 itimer
> > regressions, which hrtimers fixed. E.g.:
> >
> > http://bugzilla.kernel.org/show_bug.cgi?id=3289
>
> Ingo, please read correctly, this is mainly about interval timers,
> which my patch doesn't change. My patch only fixes the initial start
> time.

Yeah, i know it's about the start time - what else could it possibly be
about? As i wrote:

> > so i dont think restoring the first timeout of an interval timer to
> > be increased by resolution [which your patch does] has any meaning.
> > It 'restores' to half of what 2.6 did prior hrtimers. Doing that
> > would be inconsistent and would push the 'sum-up' errors observed
> > for interval timers above to be again observable in user-space (if
> > user-space does a series of timeouts). What's the point?

Your change changes the initial start time to be longer by +1 jiffy. My
"restores to half of what 2.6 did" observation was in reference to the
start time. The other half is the interval time between timeouts. If you
add a +1 jiffy to the start time, you ought to do it for the interval
time too. Or do it for neither - which is what we chose to do.

Yes, the 2.6 regression in the bugzilla was _mainly_ about the intervals
adding a comulative +1, but obviously the behavior should be symmetric:
if we use our higher resolution for intervals, we should use it for the
start time too.

In other words: your patch re-introduces half of the bug on low-res
platforms. Users doing a series of one-shot itimer calls would be
exposed to the same kind of (incorrect and unnecessary) summing-up
errors. What's the point?

Ingo

2006-02-13 15:49:37

by Roman Zippel

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

Hi,

On Mon, 13 Feb 2006, Ingo Molnar wrote:

> In other words: your patch re-introduces half of the bug on low-res
> platforms. Users doing a series of one-shot itimer calls would be
> exposed to the same kind of (incorrect and unnecessary) summing-up
> errors. What's the point?

I don't fully agree with the interval behaviour either, but here one could
at least say it's correct on average. Since hrtimer is also used for
nanosleep(), which I consider more important (as e.g. posix timer), this
one should at least be correct and consistent with previous 2.6 releases.
I don't mind fixing it properly, but just omitting the rounding here is
simply not correct.

bye, Roman

2006-02-13 19:57:52

by Ingo Molnar

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

* Roman Zippel <[email protected]> wrote:

> Hi,
>
> On Mon, 13 Feb 2006, Ingo Molnar wrote:
>
> > In other words: your patch re-introduces half of the bug on low-res
> > platforms. Users doing a series of one-shot itimer calls would be
> > exposed to the same kind of (incorrect and unnecessary) summing-up
> > errors. What's the point?
>
> I don't fully agree with the interval behaviour either, [...]

i.e. you'd want to reintroduce the comulative interval rounding bug that
users noticed? Or do you have some other way to change it? I really dont
see your point.

> [...] but here one could at least say it's correct on average. [...]

i'm not sure i understand. Are you implying by this that some current
code is not "correct on average"?

> Since hrtimer is also used for nanosleep(), which I consider more
> important (as e.g. posix timer), this one should at least be correct
> and consistent with previous 2.6 releases. [...]

for me it's simple: i dont think we should reintroduce the same type of
concept that was clearly causing regressions in previous 2.6 releases.
Thomas, what do you think?

Ingo

2006-02-13 22:29:44

by Roman Zippel

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

Hi,

On Mon, 13 Feb 2006, Ingo Molnar wrote:

> > I don't fully agree with the interval behaviour either, [...]
>
> i.e. you'd want to reintroduce the comulative interval rounding bug that
> users noticed? Or do you have some other way to change it? I really dont
> see your point.

And I don't want to expand on it, because otherwise this thread goes
completely elsewhere again and I want to keep the focus on this patch.
These are two different problems, which have have only in common that it's
about rounding of time.

> > Since hrtimer is also used for nanosleep(), which I consider more
> > important (as e.g. posix timer), this one should at least be correct
> > and consistent with previous 2.6 releases. [...]
>
> for me it's simple: i dont think we should reintroduce the same type of
> concept that was clearly causing regressions in previous 2.6 releases.

You have a weird definition of "regression", since when is a bug fix a
regression? We can discuss whether it's the correct fix and I described
earlier in this thread the basic problem, which the current 2.6 behaviour
fixes. I'd really prefer if we could based on that discuss a proper fix,
instead of just falling back to the wrong 2.4 behaviour.

bye, Roman

2006-02-14 07:43:33

by Ingo Molnar

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

ok, lets go back to this one:

* Roman Zippel <[email protected]> wrote:

> Let's assume a get_time() which simply returns xtime and so has a
> resolution of around TICK_NSEC. This means the real time when one
> calls get_time() is somewhere between xtime and xtime+TICK_NSEC.
> Assuming the real time is xtime+TICK_NSEC-1, get_time() will return
> xtime and a relative timer with TICK_NSEC-1 will expire immediately.

i agree that on systems where get_time() has a TICK_NSEC resolution,
such short timeouts are bad.

i dont agree with the fix though: it penalizes platforms where
->get_time() resolution is sane.

Ingo

2006-02-14 10:18:19

by Roman Zippel

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

Hi,

On Tue, 14 Feb 2006, Ingo Molnar wrote:

> > Let's assume a get_time() which simply returns xtime and so has a
> > resolution of around TICK_NSEC. This means the real time when one
> > calls get_time() is somewhere between xtime and xtime+TICK_NSEC.
> > Assuming the real time is xtime+TICK_NSEC-1, get_time() will return
> > xtime and a relative timer with TICK_NSEC-1 will expire immediately.
>
> i agree that on systems where get_time() has a TICK_NSEC resolution,
> such short timeouts are bad.
>
> i dont agree with the fix though: it penalizes platforms where
> ->get_time() resolution is sane.

How do you want to tell one from the other?
Can we agree, that this is the behaviour 2.6 currently already has anyway?
I fully agree, that this patch is not the best solution, but is it really
such a problem that we can't postpone the behaviour change for a short
while until we can fix it properly (i.e. via a proper clock framework)?

bye, Roman

2006-02-14 10:25:45

by Thomas Gleixner

[permalink] [raw]

Subject: Re: [PATCH 01/13] hrtimer: round up relative start time

On Tue, 2006-02-14 at 08:41 +0100, Ingo Molnar wrote:
> ok, lets go back to this one:
>
> * Roman Zippel <[email protected]> wrote:
>
> > Let's assume a get_time() which simply returns xtime and so has a
> > resolution of around TICK_NSEC. This means the real time when one
> > calls get_time() is somewhere between xtime and xtime+TICK_NSEC.
> > Assuming the real time is xtime+TICK_NSEC-1, get_time() will return
> > xtime and a relative timer with TICK_NSEC-1 will expire immediately.
>
> i agree that on systems where get_time() has a TICK_NSEC resolution,
> such short timeouts are bad.
>
> i dont agree with the fix though: it penalizes platforms where
> ->get_time() resolution is sane.

Thats true, but we have no information about get_time() resolution at
all. So the only way to work around that for now is Romans fix even if
we add the penalty to _all_ platforms.

tglx

2006-02-14 12:22:16

by Ingo Molnar

[permalink] [raw]

Subject: [patch] hrtimer: round up relative start time on low-res arches

find below a patch with less impact than blanket upwards rounding of
relatime timeouts. This i think is better for v2.6.16 - it accurately
documents the problem architectures that have low-res do_gettimeofday(),
and doesnt impact other architectures. This will go away with John's
GTOD patchset. I've reviewed every architecture for the (worst-case)
resolution of their do_gettimeofday() implementations, and this is the
result of that review.

Ingo

-----

CONFIG_TIME_LOW_RES is a temporary way for architectures to signal that
they simply return xtime in do_gettimeoffset(). In this corner-case we
want to round up by resolution when starting a relative timer, to avoid
short timeouts. This will go away with the GTOD framework.

Signed-off-by: Ingo Molnar <[email protected]>

----

arch/frv/Kconfig | 4 ++++
arch/h8300/Kconfig | 4 ++++
arch/m68k/Kconfig | 4 ++++
arch/m68knommu/Kconfig | 4 ++++
arch/parisc/Kconfig | 5 +++++
arch/v850/Kconfig | 4 ++++
kernel/hrtimer.c | 13 ++++++++++++-
7 files changed, 37 insertions(+), 1 deletion(-)

Index: linux/arch/frv/Kconfig
===================================================================
--- linux.orig/arch/frv/Kconfig
+++ linux/arch/frv/Kconfig
@@ -25,6 +25,10 @@ config GENERIC_HARDIRQS
bool
default n

+config TIME_LOW_RES
+ bool
+ default y
+
mainmenu "Fujitsu FR-V Kernel Configuration"

source "init/Kconfig"
Index: linux/arch/h8300/Kconfig
===================================================================
--- linux.orig/arch/h8300/Kconfig
+++ linux/arch/h8300/Kconfig
@@ -33,6 +33,10 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

+config TIME_LOW_RES
+ bool
+ default y
+
config ISA
bool
default y
Index: linux/arch/m68k/Kconfig
===================================================================
--- linux.orig/arch/m68k/Kconfig
+++ linux/arch/m68k/Kconfig
@@ -21,6 +21,10 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

+config TIME_LOW_RES
+ bool
+ default y
+
config ARCH_MAY_HAVE_PC_FDC
bool
depends on Q40 || (BROKEN && SUN3X)
Index: linux/arch/m68knommu/Kconfig
===================================================================
--- linux.orig/arch/m68knommu/Kconfig
+++ linux/arch/m68knommu/Kconfig
@@ -29,6 +29,10 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

+config TIME_LOW_RES
+ bool
+ default y
+
source "init/Kconfig"

menu "Processor type and features"
Index: linux/arch/parisc/Kconfig
===================================================================
--- linux.orig/arch/parisc/Kconfig
+++ linux/arch/parisc/Kconfig
@@ -29,6 +29,11 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

+config TIME_LOW_RES
+ bool
+ depends on SMP
+ default y
+
config GENERIC_ISA_DMA
bool

Index: linux/arch/v850/Kconfig
===================================================================
--- linux.orig/arch/v850/Kconfig
+++ linux/arch/v850/Kconfig
@@ -28,6 +28,10 @@ config GENERIC_IRQ_PROBE
bool
default y

+config TIME_LOW_RES
+ bool
+ default y
+
# Turn off some random 386 crap that can affect device config
config ISA
bool
Index: linux/kernel/hrtimer.c
===================================================================
--- linux.orig/kernel/hrtimer.c
+++ linux/kernel/hrtimer.c
@@ -418,8 +418,19 @@ hrtimer_start(struct hrtimer *timer, kti
/* Switch the timer base, if necessary: */
new_base = switch_hrtimer_base(timer, base);

- if (mode == HRTIMER_REL)
+ if (mode == HRTIMER_REL) {
tim = ktime_add(tim, new_base->get_time());
+ /*
+ * CONFIG_TIME_LOW_RES is a temporary way for architectures
+ * to signal that they simply return xtime in
+ * do_gettimeoffset(). In this case we want to round up by
+ * resolution when starting a relative timer, to avoid short
+ * timeouts. This will go away with the GTOD framework.
+ */
+#ifdef CONFIG_TIME_LOW_RES
+ tim = ktime_add(tim, base->resolution);
+#endif
+ }
timer->expires = tim;

enqueue_hrtimer(timer, new_base);

2006-02-14 21:51:21

by Thomas Gleixner

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

On Tue, 2006-02-14 at 13:20 +0100, Ingo Molnar wrote:
> CONFIG_TIME_LOW_RES is a temporary way for architectures to signal that
> they simply return xtime in do_gettimeoffset(). In this corner-case we
> want to round up by resolution when starting a relative timer, to avoid
> short timeouts. This will go away with the GTOD framework.
>
> Signed-off-by: Ingo Molnar <[email protected]>

Acked-by: Thomas Gleixner <[email protected]>

2006-02-15 00:30:18

by Roman Zippel

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

Hi,

On Tue, 14 Feb 2006, Ingo Molnar wrote:

> CONFIG_TIME_LOW_RES is a temporary way for architectures to signal that
> they simply return xtime in do_gettimeoffset(). In this corner-case we
> want to round up by resolution when starting a relative timer, to avoid
> short timeouts. This will go away with the GTOD framework.

This fixes the worst cases. Even the common case should somehow reflect
that the relative start time should be rounded up in the same way (even
if not by that much), e.g. due to rounding the current get_time() (at
least for the non TIME_INTERPOLATION case) has a 1usec resolution, which
should be added there.

bye, Roman

2006-02-15 09:21:55

by Ingo Molnar

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

* Roman Zippel <[email protected]> wrote:

> On Tue, 14 Feb 2006, Ingo Molnar wrote:
>
> > CONFIG_TIME_LOW_RES is a temporary way for architectures to signal that
> > they simply return xtime in do_gettimeoffset(). In this corner-case we
> > want to round up by resolution when starting a relative timer, to avoid
> > short timeouts. This will go away with the GTOD framework.
>
> This fixes the worst cases. Even the common case should somehow
> reflect that the relative start time should be rounded up in the same
> way (even if not by that much), e.g. due to rounding the current
> get_time() (at least for the non TIME_INTERPOLATION case) has a 1usec
> resolution, which should be added there.

yeah, agreed. That will be accurately fixed via GTOD's per-hwclock
resolution values. It will have another advantage as well: e.g. the
whole of m68k wont be penalized via CONFIG_TIME_LOW_RES for having a
handful of sub-arches (Apollo, Sun3x, Q40) that dont have a higher
resolution timer - every clock can define its own resolution. You could
help that effort by porting m68k to use GTOD ;-)

Ingo

2006-02-15 12:26:17

by Roman Zippel

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

Hi,

On Wed, 15 Feb 2006, Ingo Molnar wrote:

> yeah, agreed. That will be accurately fixed via GTOD's per-hwclock
> resolution values. It will have another advantage as well: e.g. the
> whole of m68k wont be penalized via CONFIG_TIME_LOW_RES for having a
> handful of sub-arches (Apollo, Sun3x, Q40) that dont have a higher
> resolution timer - every clock can define its own resolution. You could
> help that effort by porting m68k to use GTOD ;-)

I'll do that as soon as the perfomance is equal or better than what we
have right now and expensive 64bit math in the fast path, where it's
provably a waste, is not exactly encouraging. I already provided all the
math and code to keep it cheap and (relatively) simple, but I don't have
the time to work constantly on it, so if you'd help to integrate it into
John's work it would go a lot faster.

bye, Roman

2006-02-15 19:30:47

by john stultz

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

On Wed, 2006-02-15 at 13:26 +0100, Roman Zippel wrote:
> Hi,
>
> On Wed, 15 Feb 2006, Ingo Molnar wrote:
>
> > yeah, agreed. That will be accurately fixed via GTOD's per-hwclock
> > resolution values. It will have another advantage as well: e.g. the
> > whole of m68k wont be penalized via CONFIG_TIME_LOW_RES for having a
> > handful of sub-arches (Apollo, Sun3x, Q40) that dont have a higher
> > resolution timer - every clock can define its own resolution. You could
> > help that effort by porting m68k to use GTOD ;-)
>
> I'll do that as soon as the perfomance is equal or better than what we
> have right now and expensive 64bit math in the fast path, where it's
> provably a waste, is not exactly encouraging. I already provided all the
> math and code to keep it cheap and (relatively) simple, but I don't have
> the time to work constantly on it, so if you'd help to integrate it into
> John's work it would go a lot faster.

Hey Roman,
I just wanted to make sure you know I'm not ignoring your suggestions.
I do appreciate the time you have spent, and I have been continuing to
work on implementing your idea. Unfortunately the code is not trivial
and very much hurts the readability. I expect thats a sacrifice that
will be necessary, but hopefully after some review cycles we'll be able
to come to something we both like.

I'm hoping to have a first pass patch I can mail this week.

thanks
-john

2006-02-16 14:11:11

by Roman Zippel

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

Hi,

On Wed, 15 Feb 2006, john stultz wrote:

> I just wanted to make sure you know I'm not ignoring your suggestions.
> I do appreciate the time you have spent, and I have been continuing to
> work on implementing your idea. Unfortunately the code is not trivial
> and very much hurts the readability. I expect thats a sacrifice that
> will be necessary, but hopefully after some review cycles we'll be able
> to come to something we both like.

The code could be cleaned up a little, but the main difference is that my
code is much more compact, it lacks all the redundancy of your code, which
makes it harder to read. My hope was here instead of putting back the
code redundancy to add documentation instead, which would explain the
subtleties.
I actually think that the basic principle of my code is quite simple, the
problem is that it's a little buried inbetween how the incremental updates
are done in my prototype, so after a little cleaning and separating the
special cases, I think my code would be a lot more readable.

> I'm hoping to have a first pass patch I can mail this week.

I'm looking forward to it.
BTW What do you think how difficult it would be to rebase your patches on
my NTP4 patches? In the end the simplification of my patches should also
make your patches simpler, as it precalculates as much as possible and
reduces the work done in the fast paths. It would avoid a lot of extra
work, which you currently do.
The main problem that I see is that you need to drop the ppm calculations,
the new code provides a scaled nsec value per tick (tick_nsec + time_adj)
and you basically only need "(update - 10^9/HZ) / cycles" to calculate the
new multiplier adjustment.
You also need to drop your ntp rework, the changes to the generic code
should be quite simple now, basically just exporting second_overflow()
and creating an alternative do_timer() entry point which doesn't call
update_wall_time().

Besides some small cleanups I think my code is ready for some serious
testing, but it conflicts with your NTP changes.

bye, Roman

2006-02-16 19:06:18

by john stultz

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

On Thu, 2006-02-16 at 15:10 +0100, Roman Zippel wrote:
> On Wed, 15 Feb 2006, john stultz wrote:
>
> > I just wanted to make sure you know I'm not ignoring your suggestions.
> > I do appreciate the time you have spent, and I have been continuing to
> > work on implementing your idea. Unfortunately the code is not trivial
> > and very much hurts the readability. I expect thats a sacrifice that
> > will be necessary, but hopefully after some review cycles we'll be able
> > to come to something we both like.
>
> The code could be cleaned up a little, but the main difference is that my
> code is much more compact, it lacks all the redundancy of your code, which
> makes it harder to read. My hope was here instead of putting back the
> code redundancy to add documentation instead, which would explain the
> subtleties.
> I actually think that the basic principle of my code is quite simple, the
> problem is that it's a little buried inbetween how the incremental updates
> are done in my prototype, so after a little cleaning and separating the
> special cases, I think my code would be a lot more readable.

I'll admit I'm slow, but since it has taken me a number of weeks to sort
out exactly the details of what is being done in your implementation,
and I *still* have bugs after re-implementing it, I'd not claim your
code is simple. The potential to be very precise and efficient: yes, but
not so trivial to follow.

(This is why I cringe at the idea of trying to implement it for each
clock like you're prototype suggested.)

Maybe when I send out the patch you can suggest improvements to the
comments or other ways to better clarify the code as you suggested
above.

> > I'm hoping to have a first pass patch I can mail this week.
>
> I'm looking forward to it.
> BTW What do you think how difficult it would be to rebase your patches on
> my NTP4 patches?

I'd be very much open to it, although I haven't seen any further updates
to the code since I mailed you some feedback on them. Have you had a
chance to follow up on those?

> In the end the simplification of my patches should also
> make your patches simpler, as it precalculates as much as possible and
> reduces the work done in the fast paths. It would avoid a lot of extra
> work, which you currently do.

Well, I'm still cautious, since it still has some dependencies on HZ
(see equation below), which I'm trying to avoid. However I don't think
your patches keep me from getting the info I need (or atleast, the info
I think I need ;). They do seem to help close the gap between what I
want and what you want, so I think they are a good step forward.

> The main problem that I see is that you need to drop the ppm calculations,
> the new code provides a scaled nsec value per tick (tick_nsec + time_adj)
> and you basically only need "(update - 10^9/HZ) / cycles" to calculate the
> new multiplier adjustment.

My test patch does this already, although for now it calculates the ntp
interval (something like "tick_nsec + time_adj" in your code) from the
ppm value. That calculation would hopefully be replaced w/ some
ntp_get_interval() call that would pull the equivalent from your code.

> You also need to drop your ntp rework, the changes to the generic code
> should be quite simple now, basically just exporting second_overflow()
> and creating an alternative do_timer() entry point which doesn't call
> update_wall_time().
>
> Besides some small cleanups I think my code is ready for some serious
> testing, but it conflicts with your NTP changes.

If you think they're ready, mail them to the list and I'll look them
over then take a swing at re-basing my work on top of them.

thanks
-john

2006-02-16 23:44:52

by Roman Zippel

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

Hi,

On Thu, 16 Feb 2006, john stultz wrote:

> I'll admit I'm slow, but since it has taken me a number of weeks to sort
> out exactly the details of what is being done in your implementation,
> and I *still* have bugs after re-implementing it, I'd not claim your
> code is simple. The potential to be very precise and efficient: yes, but
> not so trivial to follow.

Wow, I didn't expect it to be that difficult, I'm sorry about that. :)
Next time I'll split it apart into the basic parts, so it should be easier
to read and follow.

> (This is why I cringe at the idea of trying to implement it for each
> clock like you're prototype suggested.)

I didn't suggest that, the function itself is already quite generic and
could be called like "clock_update(cycles);". What I'm suggesting is to
make it a template function, so that one create a optimized version based
on some parameters (e.g. it's possible to optimize some parts away by
making them constant).
That's not really necessary for the first version, only that a clock can
call the "clock_update(cycles);" directly from it's interrupt handler,
later it can be replaced with an optimized version.

> Maybe when I send out the patch you can suggest improvements to the
> comments or other ways to better clarify the code as you suggested
> above.

Now I'm really curious. :)

> I'd be very much open to it, although I haven't seen any further updates
> to the code since I mailed you some feedback on them. Have you had a
> chance to follow up on those?

Not yet, but it would only minor updates (mostly documenting it better and
cleanups as you suggested), the basic stuff is still the same.

> > In the end the simplification of my patches should also
> > make your patches simpler, as it precalculates as much as possible and
> > reduces the work done in the fast paths. It would avoid a lot of extra
> > work, which you currently do.
>
> Well, I'm still cautious, since it still has some dependencies on HZ
> (see equation below), which I'm trying to avoid.

There is no real dependency on HZ, it's just that the synchronisations
steps and incremental updates are done in fixed intervals. The interval
could easily be independent of HZ.

bye, Roman

2006-02-17 00:28:09

by john stultz

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

On Fri, 2006-02-17 at 00:44 +0100, Roman Zippel wrote:
> On Thu, 16 Feb 2006, john stultz wrote:
> > > In the end the simplification of my patches should also
> > > make your patches simpler, as it precalculates as much as possible and
> > > reduces the work done in the fast paths. It would avoid a lot of extra
> > > work, which you currently do.
> >
> > Well, I'm still cautious, since it still has some dependencies on HZ
> > (see equation below), which I'm trying to avoid.
>
> There is no real dependency on HZ, it's just that the synchronisations
> steps and incremental updates are done in fixed intervals. The interval
> could easily be independent of HZ.

Ok, one concern was that in the cycle->interval conversion, some
interval lengths are not possible due to the clock's resolution.

In my mind, I'd like to provide a interval length to the NTP code and
have the NTP code provide an adjusted interval which can be used in
error accumulation and the resulting multiplier adjustment.

Or, we just write off the cycle->interval error as part of the clock's
natural error and let the NTP daemon compensate for it. Your thoughts?

Regardless, the point is that I'd prefer if the timeofday code to be
able to specify to the NTP code what the interval length is, rather then
the other way around. Does that sound reasonable?

thanks
-john

2006-02-17 15:02:38

by Roman Zippel

[permalink] [raw]

Subject: Re: [patch] hrtimer: round up relative start time on low-res arches

Hi,

On Thu, 16 Feb 2006, john stultz wrote:

> > There is no real dependency on HZ, it's just that the synchronisations
> > steps and incremental updates are done in fixed intervals. The interval
> > could easily be independent of HZ.
>
> Ok, one concern was that in the cycle->interval conversion, some
> interval lengths are not possible due to the clock's resolution.
>
> In my mind, I'd like to provide a interval length to the NTP code and
> have the NTP code provide an adjusted interval which can be used in
> error accumulation and the resulting multiplier adjustment.
>
> Or, we just write off the cycle->interval error as part of the clock's
> natural error and let the NTP daemon compensate for it. Your thoughts?

Here my example does correct for this error, it accumulates the difference
between the clock update and the desired ntp update and once it's large
enough, it's corrected by a multiplier change.

> Regardless, the point is that I'd prefer if the timeofday code to be
> able to specify to the NTP code what the interval length is, rather then
> the other way around. Does that sound reasonable?

I don't understand what the advantage would be, the NTP code needs both
and the time interval is actually the variable part, so AFAICT it would
make the NTP code only more complex. (NTP changes the amount of time to be
passed per tick to adjust clock speed and not the other way around.)

bye, Roman