2006-02-13 08:29:00

by Ulrich Windl

Subject: 2.6.15:kernel/time.c: The Nanosecond and code duplication

Hi!

I'm working on an integration of current NTP kernel algorithms for Linux 2.6.
xtime now has nanosecond resolution, but there's no POSIX like syscall interface
(clock_getres, clock_gettime, clock_settime) yet.

There's a hacked-on getnstimeofday() which, as I discovered, doesn't actually
pass along the nanosecond resolution of xtime. It does:

void getnstimeofday(struct timespec *tv)
{
struct timeval x;

do_gettimeofday(&x);
tv->tv_sec = x.tv_sec;
tv->tv_nsec = x.tv_usec * NSEC_PER_USEC;
}

The proper solution most likely is to define POSIX compatible routines with
nanosecond resolution, and then define the microsecond-resolution from those, and
not the other way round.
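
As a rough user-space sketch of that layering (not kernel code), the microsecond call can be a thin wrapper around a nanosecond primitive. The helper names timespec_to_timeval_trunc() and my_gettimeofday() are made up for illustration, and clock_gettime(CLOCK_REALTIME) merely stands in for whatever nanosecond routine the kernel would provide:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/time.h>
#include <time.h>

#define NSEC_PER_USEC 1000L
#define NSEC_PER_SEC  1000000000L

/* Hypothetical helper: convert a normalized timespec to a timeval by
 * truncating the nanoseconds to whole microseconds. */
static struct timeval timespec_to_timeval_trunc(struct timespec ts)
{
    struct timeval tv;

    assert(ts.tv_nsec >= 0 && ts.tv_nsec < NSEC_PER_SEC);
    tv.tv_sec = ts.tv_sec;
    tv.tv_usec = ts.tv_nsec / NSEC_PER_USEC;
    return tv;
}

/* Hypothetical microsecond-resolution call built on the nanosecond one,
 * i.e. "not the other way round". Returns 0 on success, -1 on error. */
static int my_gettimeofday(struct timeval *tv)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
        return -1;
    *tv = timespec_to_timeval_trunc(ts);
    return 0;
}
```

With this direction of derivation, no precision is lost before the final conversion, and the nanosecond value stays available to callers that want it.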

Also there are severe religious wars on what a clock interface should look like for
a particular architecture. Besides the time interpolator there are architecture-
specific get_offset() calls. While that makes some people happy, it causes an
explosion in the amount and complexity of code. That's bad IMHO. I'd strongly
prefer one time variable (xtime) and an interpolator for the time elapsed since
xtime was updated, combined with a method to get consistent time.

To make a long story short, here's a patch (just for inspiration) I made to make
the nanoseconds available to other modules and to user land (via new methods
outside this patch):

Index: kernel/time.c
===================================================================
RCS file: /root/LinuxCVS/Kernel/kernel/time.c,v
retrieving revision 1.1.1.6.2.1
diff -u -r1.1.1.6.2.1 time.c
--- kernel/time.c 11 Feb 2006 18:16:28 -0000 1.1.1.6.2.1
+++ kernel/time.c 12 Feb 2006 17:30:51 -0000
@@ -1405,26 +1407,36 @@
}
EXPORT_SYMBOL(timespec_trunc);

-#ifdef CONFIG_TIME_INTERPOLATION
+/* get system time with nanosecond accuracy */
void getnstimeofday (struct timespec *tv)
{
- unsigned long seq,sec,nsec;
-
+ unsigned long seq, nsec, sec, offset;
do {
seq = read_seqbegin(&xtime_lock);
+#ifdef CONFIG_TIME_INTERPOLATION
+ offset = time_interpolator_get_offset();
+#else
+ offset = 0;
+#endif
sec = xtime.tv_sec;
- nsec = xtime.tv_nsec+time_interpolator_get_offset();
+ nsec = xtime.tv_nsec + offset;
} while (unlikely(read_seqretry(&xtime_lock, seq)));

+#ifdef CONFIG_TIME_INTERPOLATION
while (unlikely(nsec >= NSEC_PER_SEC)) {
nsec -= NSEC_PER_SEC;
++sec;
}
+#endif
tv->tv_sec = sec;
tv->tv_nsec = nsec;
}
EXPORT_SYMBOL_GPL(getnstimeofday);

+#ifdef CONFIG_TIME_INTERPOLATION
+/* this is a mess: there are also architecture-dependent ``do_gettimeofday()''
+ * and ``do_settimeofday()''
+ */
int do_settimeofday (struct timespec *tv)
{
time_t wtm_sec, sec = tv->tv_sec;
@@ -1451,42 +1463,14 @@

void do_gettimeofday (struct timeval *tv)
{
- unsigned long seq, nsec, usec, sec, offset;
- do {
- seq = read_seqbegin(&xtime_lock);
- offset = time_interpolator_get_offset();
- sec = xtime.tv_sec;
- nsec = xtime.tv_nsec;
- } while (unlikely(read_seqretry(&xtime_lock, seq)));
+ struct timespec ts;

- usec = (nsec + offset) / 1000;
-
- while (unlikely(usec >= USEC_PER_SEC)) {
- usec -= USEC_PER_SEC;
- ++sec;
- }
-
- tv->tv_sec = sec;
- tv->tv_usec = usec;
+ getnstimeofday(&ts);
+ tv->tv_sec = ts.tv_sec;
+ tv->tv_usec = (ts.tv_nsec + 500) / 1000;
}

EXPORT_SYMBOL(do_gettimeofday);
-
-
-#else
-/*
- * Simulate gettimeofday using do_gettimeofday which only allows a timeval
- * and therefore only yields usec accuracy
- */
-void getnstimeofday(struct timespec *tv)
-{
- struct timeval x;
-
- do_gettimeofday(&x);
- tv->tv_sec = x.tv_sec;
- tv->tv_nsec = x.tv_usec * NSEC_PER_USEC;
-}
-EXPORT_SYMBOL_GPL(getnstimeofday);
#endif

void getnstimestamp(struct timespec *ts)
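
One detail of the rounding in do_gettimeofday() above may deserve a second look: for tv_nsec >= 999999500, the expression (ts.tv_nsec + 500) / 1000 evaluates to 1000000, which is outside the valid tv_usec range. A minimal user-space sketch of rounding with a carry into the seconds follows; timespec_to_timeval_round() is a hypothetical helper, not part of the patch:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/time.h>
#include <time.h>

#define USEC_PER_SEC 1000000L

/* Hypothetical helper (not part of the patch): round nanoseconds to the
 * nearest microsecond and carry into the seconds if rounding overflows. */
static struct timeval timespec_to_timeval_round(struct timespec ts)
{
    struct timeval tv;

    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    tv.tv_sec = ts.tv_sec;
    tv.tv_usec = (ts.tv_nsec + 500) / 1000;
    if (tv.tv_usec >= USEC_PER_SEC) {   /* only when tv_nsec >= 999999500 */
        tv.tv_usec -= USEC_PER_SEC;
        ++tv.tv_sec;
    }
    return tv;
}
```

Plain truncation (dropping the + 500) would avoid the carry altogether and would match what the removed code computed.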


Regards,
Ulrich


2006-02-13 10:12:36

by Andi Kleen

Subject: Re: 2.6.15:kernel/time.c: The Nanosecond and code duplication

"Ulrich Windl" writes:

> but there's no POSIX like syscall interface
> (clock_getres, clock_gettime, clock_settime) yet.

% grep clock include/asm-x86_64/unistd.h
#define __NR_clock_settime 227
__SYSCALL(__NR_clock_settime, sys_clock_settime)
#define __NR_clock_gettime 228
__SYSCALL(__NR_clock_gettime, sys_clock_gettime)
#define __NR_clock_getres 229
__SYSCALL(__NR_clock_getres, sys_clock_getres)
#define __NR_clock_nanosleep 230
__SYSCALL(__NR_clock_nanosleep, sys_clock_nanosleep)

Has been available for quite some time.

However the calls are currently slower than gettimeofday and also
don't use nanoseconds internally in all cases (depends on the architecture),
but still microseconds. But I'm not sure it matters that much
because the underlying timers are often not better than microseconds
anyways and with nanoseconds you start to time even the inherent system
call latency.
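
For anyone who wants to check what a given system reports, here is a small user-space sketch using the standard POSIX calls clock_getres() and clock_gettime(). The helper name report_clock() is made up for this example, and the reported resolution is only what the kernel claims, not a measurement of the underlying timer:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: print the resolution the kernel reports for a
 * clock, plus one reading. Returns 0 on success, -1 on error. */
static int report_clock(clockid_t id, const char *name)
{
    struct timespec res, now;

    if (clock_getres(id, &res) != 0 || clock_gettime(id, &now) != 0)
        return -1;

    /* POSIX guarantees a normalized nanosecond field. */
    assert(now.tv_nsec >= 0 && now.tv_nsec < 1000000000L);

    printf("%s: resolution %lld.%09ld s, now %lld.%09ld s\n", name,
           (long long)res.tv_sec, res.tv_nsec,
           (long long)now.tv_sec, now.tv_nsec);
    return 0;
}
```

Calling report_clock(CLOCK_REALTIME, "CLOCK_REALTIME") from main() prints one line; on older C libraries linking may require -lrt.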



-Andi

2006-02-13 11:47:41

by Roman Zippel

Subject: Re: 2.6.15:kernel/time.c: The Nanosecond and code duplication

Hi,

On Mon, 13 Feb 2006, Ulrich Windl wrote:

> I'm working on an integration of current NTP kernel algorithms for Linux 2.6.

Ulrich, do you know of my patches at http://www.xs4all.nl/~zippel/ntp/patches-2.6.15-rc6-git2/ ?
I posted them already to lkml. They do this already (at least for the non
pps parts).

bye, Roman

2006-02-13 14:49:16

by Ulrich Windl

Subject: Re: 2.6.15:kernel/time.c: The Nanosecond and code duplication

On 13 Feb 2006 at 11:12, Andi Kleen wrote:

> "Ulrich Windl" writes:
>
> > but there's no POSIX like syscall interface
> > (clock_getres, clock_gettime, clock_settime) yet.
>
> % grep clock include/asm-x86_64/unistd.h
> #define __NR_clock_settime 227
> __SYSCALL(__NR_clock_settime, sys_clock_settime)
> #define __NR_clock_gettime 228
> __SYSCALL(__NR_clock_gettime, sys_clock_gettime)
> #define __NR_clock_getres 229
> __SYSCALL(__NR_clock_getres, sys_clock_getres)
> #define __NR_clock_nanosleep 230
> __SYSCALL(__NR_clock_nanosleep, sys_clock_nanosleep)
>
> Has been available for quite some time.
>
> However the calls are currently slower than gettimeofday and also
> don't use nanoseconds internally in all cases (depends on the architecture),
> but still microseconds. But I'm not sure it matters that much
> because the underlying timers are often not better than microseconds
> anyways and with nanoseconds you start to time even the inherent system
> call latency.

Andi,

thanks! I must have overlooked the implementation of those. As for when you
would want nanoseconds even if get_offset() only returns microsecond
granularity: when you mathematically correct the clock with nanosecond accuracy (or
even finer). With one model you'll see gradual time change, in the other case
you'll see jumps of 1000ns. OK, the clock might jump by 100ns for other reasons,
but currently the clock is so amazingly stable that I hardly believe the results
I've measured (but that's quite another topic).
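
To illustrate the difference between the two models, here is a toy user-space sketch. It assumes a clock that truly advances in small, even steps and compares what a reader sees with and without truncation to whole microseconds; quantize_to_usec() and max_jump() are hypothetical names used only for this example:

```c
#include <assert.h>

#define NSEC_PER_USEC 1000L

/* What a reader sees if the value is truncated to whole microseconds. */
static long quantize_to_usec(long nsec)
{
    assert(nsec >= 0);
    return (nsec / NSEC_PER_USEC) * NSEC_PER_USEC;
}

/* Largest step between consecutive readings when the true time advances
 * by step_ns per sample, with or without microsecond truncation. */
static long max_jump(long start_ns, long step_ns, int samples, int quantized)
{
    long prev = quantized ? quantize_to_usec(start_ns) : start_ns;
    long worst = 0;

    for (int i = 1; i <= samples; i++) {
        long t = start_ns + i * step_ns;
        long seen = quantized ? quantize_to_usec(t) : t;

        if (seen - prev > worst)
            worst = seen - prev;
        prev = seen;
    }
    return worst;
}
```

With a true step of 100ns over 50 samples, the unquantized readings never move by more than 100ns at a time, while the microsecond-truncated readings stay flat and then jump by 1000ns.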

I'm fully aware that nanosecond resolution will give us peace for the next few
years. However my 700 MHz Pentium III is already as low as 1 µs of jitter, so in
theory the nanosecond may be worth it (e.g. getting a better estimate of the
actual jitter).

Regards,
Ulrich

2006-02-13 21:11:22

by Christoph Lameter

Subject: Re: 2.6.15:kernel/time.c: The Nanosecond and code duplication

On Mon, 13 Feb 2006, Ulrich Windl wrote:

> There's a hacked-on getnstimeofday() which, what I discovered doesn't actually
> pass along the nanosecond resolution of xtime. It does:

This is the fall back function for arches without nanosecond
resolution....

> The proper solution most likely is to define POSIX compatible routines with
> nanosecond resolution, and then define the microsecond-resolution from those, and
> not the other way round.

Right.

> -#ifdef CONFIG_TIME_INTERPOLATION
> +/* get system time with nanosecond accuracy */
> void getnstimeofday (struct timespec *tv)
> {
> - unsigned long seq,sec,nsec;
> -
> + unsigned long seq, nsec, sec, offset;
> do {
> seq = read_seqbegin(&xtime_lock);
> +#ifdef CONFIG_TIME_INTERPOLATION
> + offset = time_interpolator_get_offset();
> +#else
> + offset = 0;
> +#endif
> sec = xtime.tv_sec;
> - nsec = xtime.tv_nsec+time_interpolator_get_offset();
> + nsec = xtime.tv_nsec + offset;
> } while (unlikely(read_seqretry(&xtime_lock, seq)));
>
> +#ifdef CONFIG_TIME_INTERPOLATION
> while (unlikely(nsec >= NSEC_PER_SEC)) {
> nsec -= NSEC_PER_SEC;
> ++sec;
> }
> +#endif
> tv->tv_sec = sec;
> tv->tv_nsec = nsec;
> }
> EXPORT_SYMBOL_GPL(getnstimeofday);

Looks okay.

> +#ifdef CONFIG_TIME_INTERPOLATION
> +/* this is a mess: there are also architecture-dependent ``do_gettimeofday()''
> + * and ``do_settimeofday()''
> + */

Yes, we would like to get rid of the arch specific
do_get/settimeofday() in the future.

2006-02-14 06:58:10

by Ulrich Windl

Subject: Re: 2.6.15:kernel/time.c: The Nanosecond and code duplication

On 13 Feb 2006 at 13:11, Christoph Lameter wrote:

> On Mon, 13 Feb 2006, Ulrich Windl wrote:
>
> > There's a hacked-on getnstimeofday() which, what I discovered doesn't actually
> > pass along the nanosecond resolution of xtime. It does:
>
> This is the fall back function for arches without nanosecond
> resolution....

Like the i386 family? Having seen some more of the code, I found that
posix-timers.c also has its own family of time routines (plus routines that seem
quite hard to use inside the kernel, so I added just another wrapper). I really
think there are too many functions all dealing with getting the current time, and
that nowadays all lower-resolution clocks should be derived from the POSIX
time routines (I'm talking about the concept, not a particular implementation).

>
> > The proper solution most likely is to define POSIX compatible routines with
> > nanosecond resolution, and then define the microsecond-resolution from those, and
> > not the other way round.
>
> Right.

;-)

Regards,
Ulrich