2010-11-15 10:35:32

by Linus Walleij

Subject: [PATCH] clocksource: document some basic concepts

This adds some documentation about clock sources and the weak
sched_clock() function that answers questions that repeatedly
arise on the mailing lists.

Cc: Thomas Gleixner <[email protected]>
Cc: Nicolas Pitre <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Rabin Vincent <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
---
Documentation/timers/00-INDEX | 2 +
Documentation/timers/clocksource.txt | 106 ++++++++++++++++++++++++++++++++++
2 files changed, 108 insertions(+), 0 deletions(-)
create mode 100644 Documentation/timers/clocksource.txt

diff --git a/Documentation/timers/00-INDEX b/Documentation/timers/00-INDEX
index a9248da..fb88065 100644
--- a/Documentation/timers/00-INDEX
+++ b/Documentation/timers/00-INDEX
@@ -1,5 +1,7 @@
00-INDEX
- this file
+clocksource.txt
+ - Clock sources and sched_clock() notes
highres.txt
- High resolution timers and dynamic ticks design notes
hpet.txt
diff --git a/Documentation/timers/clocksource.txt b/Documentation/timers/clocksource.txt
new file mode 100644
index 0000000..cf4ab9e
--- /dev/null
+++ b/Documentation/timers/clocksource.txt
@@ -0,0 +1,106 @@
+Clock sources and sched_clock()
+-------------------------------
+
+If you grep through the kernel source you will find a number of architecture-
+specific implementations of clock sources and several likewise architecture-
+specific overrides of the sched_clock() function.
+
+To provide timekeeping for your platform, the clock source provides
+the basic timeline, whereas clock events shoot interrupts on certain points
+on this timeline, providing facilities such as high-resolution timers.
+sched_clock() is used for scheduling and timestamping.
+
+
+Clock sources
+-------------
+
+The purpose of the clock source is to provide a timeline for the system that
+tells you where you are in time. For example issuing the command 'date' on
+a Linux system will eventually read the clock source to determine exactly
+what time it is.
+
+Typically the clock source is a monotonic, atomic counter which will provide
+n bits which count from 0 to (2^n-1) and then wraps around to 0 and start over.
+
+The clock source shall have as high resolution as possible, and shall be as
+stable and correct as possible as compared to a real-world wall clock. It
+should not move unpredictably back and forth in time or miss a few cycles
+here and there.
+
+It must be immune the kind of effects that occur in hardware where e.g. the
+counter register is read in two phases on the bus lowest 16 bits first and
+the higher 16 bits in a second bus cycle with the counter bits potentially
+being updated inbetween leading to the risk of very strange values from the
+counter.
+
+When the wall-clock accuracy of the clock source isn't satisfactory, there
+are various quirks and layers in the timekeeping code for e.g. synchronizing
+the user-visible time to RTC clocks in the system or against networked time
+servers using NTP, but all they do is basically to update an offset against
+the clock source, which provides the fundamental timeline for the system.
+These measures does not affect the clock source per se.
+
+The clock source struct shall provide means to translate the provided counter
+into a rough nanosecond value as an unsigned long long (unsigned 64 bit) number.
+Since this operation may be invoked very often doing this in a strict
+mathematical sense is not desireable: instead the number is taken as close as
+possible to a nanosecond value using only the arithmetic operations
+mult and shift, so in clocksource_cyc2ns() you find:
+
+ ns ~= (clocksource * mult) >> shift
+
+You will find a number of helper functions in the clock source code intended
+to aid in providing these mult and shift values, such as
+clocksource_khz2mult(), clocksource_hz2mult() that help determinining the
+mult factor from a fixed shift, and clocksource_calc_mult_shift() and
+clocksource_register_hz() which will help out assigning both shift and mult
+factors using the frequency of the clock source and desirable minimum idle
+time as the only input. In the past, the timekeeping authors would come up with
+these values by hand, which is why you will sometimes find hard-coded shift
+and mult values in the code.
+
+Since a 32 bit counter at say 100 MHz will wrap around to zero after some 43
+seconds, the code handling the clock source will have to compensate for this.
+That is the reason to why the clock source struct also contains a 'mask'
+member telling how many bits of the source are valid. This way the timekeeping
+code knows when the counter will wrap around and can insert the necessary
+compensation code on both sides of the wrap point so that the system timeline
+remains monotonic. Note that the clocksource_cyc2ns() function will not
+compensate for wrap-arounds: it will return the rough number of nanoseconds
+since the last wrap-around.
+
+You will notice that the clock event device code is based on the same basic
+idea about translating counters to nanoseconds using mult and shift
+arithmetics, and you find the same family of helper functions again for
+assigning these values. The clock event driver does not need a 'mask'
+attribute however: the system will not try to plan events beyond the time
+horizon of the clock event.
+
+
+sched_clock()
+-------------
+
+In addition to the clock sources and clock events there is a special weak
+function in the kernel called sched_clock(). This function shall return the
+number of nanoseconds since the system was started. An architecture may or
+may not provide an implementation of sched_clock() on its own.
+
+As the name suggests, sched_clock() is used for scheduling the system,
+determining the absolute timeslice for a certain process in the CFS scheduler
+for example. It is also used for printk timestamps when you have selected to
+include time information in printk for things like bootcharts.
+
+Compared to clock sources, sched_clock() has to be very fast: it is called
+much more often, especially by the scheduler. If you have to do trade-offs
+between accuracy compared to the clock source, you may sacrifice accuracy
+for speed in sched_clock(). It however require the same basic characteristics
+as the clock source, i.e. it has to be monotonic.
+
+The sched_clock() function may wrap only on unsigned long long boundaries,
+i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
+after circa 585 years. (For most practical systems this means "never".)
+
+If an architecture does not provide its own implementation of this function,
+it will fall back to using jiffies, making its maximum resolution 1/HZ of the
+jiffy frequency for the architecture. This will affect scheduling accuracy
+and will likely show up in system benchmarks.
--
1.6.3.3


2010-11-15 10:48:19

by Peter Zijlstra

Subject: Re: [PATCH] clocksource: document some basic concepts

On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> +sched_clock()
> +-------------
> +
> +In addition to the clock sources and clock events there is a special weak
> +function in the kernel called sched_clock(). This function shall return the
> +number of nanoseconds since the system was started. An architecture may or
> +may not provide an implementation of sched_clock() on its own.
> +
> +As the name suggests, sched_clock() is used for scheduling the system,
> +determining the absolute timeslice for a certain process in the CFS scheduler
> +for example. It is also used for printk timestamps when you have selected to
> +include time information in printk for things like bootcharts.
> +
> +Compared to clock sources, sched_clock() has to be very fast: it is called
> +much more often, especially by the scheduler. If you have to do trade-offs
> +between accuracy compared to the clock source, you may sacrifice accuracy
> +for speed in sched_clock(). It however require the same basic characteristics
> +as the clock source, i.e. it has to be monotonic.

Not so, we prefer it be synchronized and monotonic, but we don't require
so, see below.

> +The sched_clock() function may wrap only on unsigned long long boundaries,
> +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> +after circa 585 years. (For most practical systems this means "never".)

Currently true, John Stultz was going to look into amending this by
teaching the kernel/sched_clock.c bits about early wraps (and a way for
architectures to specify this)

#define SCHED_CLOCK_WRAP_BITS 48

...

#ifdef SCHED_CLOCK_WRAP_BITS
/* handle short wraps */
#endif

foo for wrap_min/wrap_max and "delta = now - scd->tick_raw" like things
might work.
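
A minimal user-space sketch of the "delta = now - last" idea, assuming a
raw counter with only SCHED_CLOCK_WRAP_BITS valid bits. The helper names
wrap_mask() and wrap_delta() are hypothetical and are not taken from
kernel/sched_clock.c; the point is only that a masked subtraction stays
correct across a single wrap of the counter:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: number of valid bits in the raw counter. This mirrors the
 * SCHED_CLOCK_WRAP_BITS idea above and is not an existing kernel symbol. */
#define SCHED_CLOCK_WRAP_BITS 48

/* Mask covering the valid low 'bits' bits of the counter. */
static uint64_t wrap_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	/* Shifting a 64-bit value by 64 is undefined, so special-case it. */
	return bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/* Cycles elapsed from 'last' to 'now', computed modulo 2^bits.
 * Correct as long as at most one wrap happened between the two reads. */
static uint64_t wrap_delta(uint64_t now, uint64_t last, unsigned int bits)
{
	return (now - last) & wrap_mask(bits);
}
```

With a 48-bit counter, a reading of 5 taken shortly after a reading of
2^48 - 5 yields a delta of 10 rather than a huge bogus value. The sketch
cannot detect more than one wrap between two reads, so the caller still
has to sample often enough.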

> +If an architecture does not provide its own implementation of this function,
> +it will fall back to using jiffies, making its maximum resolution 1/HZ of the
> +jiffy frequency for the architecture. This will affect scheduling accuracy
> +and will likely show up in system benchmarks.

sched_clock() need not be synchronized between CPUs, nor even be
monotonic, we prefer a fast high res clock over a slow one,
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK provides infrastructure to sanitize the
output of sched_clock().

[ of course we prefer a fast and synchronized clock, but we take fast
over synchronized ]

sched_clock() requires local IRQs to be disabled.

Therefore, sched_clock() shall not be used, see kernel/sched_clock.c for
detail and alternative interfaces.

2010-11-15 10:50:37

by Peter Zijlstra

Subject: Re: [PATCH] clocksource: document some basic concepts

On Mon, 2010-11-15 at 11:48 +0100, Peter Zijlstra wrote:
>
> Therefore, sched_clock() shall not be used, see kernel/sched_clock.c for
> detail and alternative interfaces.

we should probably rename the thing to __arch_sched_clock() and migrate
people to the kernel/sched_clock.c interfaces.

2010-11-15 16:34:09

by Randy Dunlap

Subject: Re: [PATCH] clocksource: document some basic concepts

On Mon, 15 Nov 2010 11:33:48 +0100 Linus Walleij wrote:

> This adds some documentation about clock sources and the weak
> sched_clock() function that answers questions that repeatedly
> arise on the mailing lists.
>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Nicolas Pitre <[email protected]>
> Cc: Colin Cross <[email protected]>
> Cc: John Stultz <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Rabin Vincent <[email protected]>
> Signed-off-by: Linus Walleij <[email protected]>
> ---
> Documentation/timers/00-INDEX | 2 +
> Documentation/timers/clocksource.txt | 106 ++++++++++++++++++++++++++++++++++
> 2 files changed, 108 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/timers/clocksource.txt
>
> diff --git a/Documentation/timers/00-INDEX b/Documentation/timers/00-INDEX
> index a9248da..fb88065 100644
> --- a/Documentation/timers/00-INDEX
> +++ b/Documentation/timers/00-INDEX
> @@ -1,5 +1,7 @@
> 00-INDEX
> - this file
> +clocksource.txt
> + - Clock sources and sched_clock() notes
> highres.txt
> - High resolution timers and dynamic ticks design notes
> hpet.txt
> diff --git a/Documentation/timers/clocksource.txt b/Documentation/timers/clocksource.txt
> new file mode 100644
> index 0000000..cf4ab9e
> --- /dev/null
> +++ b/Documentation/timers/clocksource.txt
> @@ -0,0 +1,106 @@
> +Clock sources and sched_clock()
> +-------------------------------
> +
> +If you grep through the kernel source you will find a number of architecture-
> +specific implementations of clock sources and several likewise architecture-
> +specific overrides of the sched_clock() function.
> +
> +To provide timekeeping for your platform, the clock source provides
> +the basic timeline, whereas clock events shoot interrupts on certain points
> +on this timeline, providing facilities such as high-resolution timers.
> +sched_clock() is used for scheduling and timestamping.
> +
> +
> +Clock sources
> +-------------
> +
> +The purpose of the clock source is to provide a timeline for the system that
> +tells you where you are in time. For example issuing the command 'date' on
> +a Linux system will eventually read the clock source to determine exactly
> +what time it is.
> +
> +Typically the clock source is a monotonic, atomic counter which will provide
> +n bits which count from 0 to (2^n-1) and then wraps around to 0 and start over.
> +
> +The clock source shall have as high resolution as possible, and shall be as
> +stable and correct as possible as compared to a real-world wall clock. It
> +should not move unpredictably back and forth in time or miss a few cycles
> +here and there.
> +
> +It must be immune the kind of effects that occur in hardware where e.g. the

immune from the

> +counter register is read in two phases on the bus lowest 16 bits first and

on the bus (lowest

> +the higher 16 bits in a second bus cycle with the counter bits potentially

bus cycle) with

> +being updated inbetween leading to the risk of very strange values from the
> +counter.
> +
> +When the wall-clock accuracy of the clock source isn't satisfactory, there
> +are various quirks and layers in the timekeeping code for e.g. synchronizing
> +the user-visible time to RTC clocks in the system or against networked time
> +servers using NTP, but all they do is basically to update an offset against
> +the clock source, which provides the fundamental timeline for the system.
> +These measures does not affect the clock source per se.
> +
> +The clock source struct shall provide means to translate the provided counter
> +into a rough nanosecond value as an unsigned long long (unsigned 64 bit) number.

64-bit)

> +Since this operation may be invoked very often doing this in a strict
> +mathematical sense is not desireable: instead the number is taken as close as

desirable:

> +possible to a nanosecond value using only the arithmetic operations
> +mult and shift, so in clocksource_cyc2ns() you find:
> +
> + ns ~= (clocksource * mult) >> shift
> +
> +You will find a number of helper functions in the clock source code intended
> +to aid in providing these mult and shift values, such as
> +clocksource_khz2mult(), clocksource_hz2mult() that help determinining the

that help determine

> +mult factor from a fixed shift, and clocksource_calc_mult_shift() and
> +clocksource_register_hz() which will help out assigning both shift and mult
> +factors using the frequency of the clock source and desirable minimum idle
> +time as the only input. In the past, the timekeeping authors would come up with
> +these values by hand, which is why you will sometimes find hard-coded shift
> +and mult values in the code.
> +
> +Since a 32 bit counter at say 100 MHz will wrap around to zero after some 43

32-bit

> +seconds, the code handling the clock source will have to compensate for this.
> +That is the reason to why the clock source struct also contains a 'mask'
> +member telling how many bits of the source are valid. This way the timekeeping
> +code knows when the counter will wrap around and can insert the necessary
> +compensation code on both sides of the wrap point so that the system timeline
> +remains monotonic. Note that the clocksource_cyc2ns() function will not
> +compensate for wrap-arounds: it will return the rough number of nanoseconds
> +since the last wrap-around.
> +
> +You will notice that the clock event device code is based on the same basic
> +idea about translating counters to nanoseconds using mult and shift
> +arithmetics, and you find the same family of helper functions again for
> +assigning these values. The clock event driver does not need a 'mask'
> +attribute however: the system will not try to plan events beyond the time
> +horizon of the clock event.
> +
> +
> +sched_clock()
> +-------------
> +
> +In addition to the clock sources and clock events there is a special weak
> +function in the kernel called sched_clock(). This function shall return the
> +number of nanoseconds since the system was started. An architecture may or
> +may not provide an implementation of sched_clock() on its own.
> +
> +As the name suggests, sched_clock() is used for scheduling the system,
> +determining the absolute timeslice for a certain process in the CFS scheduler
> +for example. It is also used for printk timestamps when you have selected to
> +include time information in printk for things like bootcharts.
> +
> +Compared to clock sources, sched_clock() has to be very fast: it is called
> +much more often, especially by the scheduler. If you have to do trade-offs
> +between accuracy compared to the clock source, you may sacrifice accuracy
> +for speed in sched_clock(). It however require the same basic characteristics

requires

> +as the clock source, i.e. it has to be monotonic.
> +
> +The sched_clock() function may wrap only on unsigned long long boundaries,
> +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> +after circa 585 years. (For most practical systems this means "never".)
> +
> +If an architecture does not provide its own implementation of this function,
> +it will fall back to using jiffies, making its maximum resolution 1/HZ of the
> +jiffy frequency for the architecture. This will affect scheduling accuracy
> +and will likely show up in system benchmarks.
> --


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-15 19:45:57

by john stultz

Subject: Re: [PATCH] clocksource: document some basic concepts

On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> This adds some documentation about clock sources and the weak
> sched_clock() function that answers questions that repeatedly
> arise on the mailing lists.
>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Nicolas Pitre <[email protected]>
> Cc: Colin Cross <[email protected]>
> Cc: John Stultz <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Rabin Vincent <[email protected]>
> Signed-off-by: Linus Walleij <[email protected]>
> ---
> Documentation/timers/00-INDEX | 2 +
> Documentation/timers/clocksource.txt | 106 ++++++++++++++++++++++++++++++++++
> 2 files changed, 108 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/timers/clocksource.txt
>
> diff --git a/Documentation/timers/00-INDEX b/Documentation/timers/00-INDEX
> index a9248da..fb88065 100644
> --- a/Documentation/timers/00-INDEX
> +++ b/Documentation/timers/00-INDEX
> @@ -1,5 +1,7 @@
> 00-INDEX
> - this file
> +clocksource.txt
> + - Clock sources and sched_clock() notes
> highres.txt
> - High resolution timers and dynamic ticks design notes
> hpet.txt
> diff --git a/Documentation/timers/clocksource.txt b/Documentation/timers/clocksource.txt
> new file mode 100644
> index 0000000..cf4ab9e
> --- /dev/null
> +++ b/Documentation/timers/clocksource.txt
> @@ -0,0 +1,106 @@
> +Clock sources and sched_clock()
> +-------------------------------

Thanks for writing this up!

I do worry a little that by talking about the two subjects in the same
document, it creates an impression that the two infrastructures are
conceptually linked (even though this is mostly about the differences
between them).

> +If you grep through the kernel source you will find a number of architecture-
> +specific implementations of clock sources and several likewise architecture-
> +specific overrides of the sched_clock() function.
> +
> +To provide timekeeping for your platform, the clock source provides
> +the basic timeline, whereas clock events shoot interrupts on certain points
> +on this timeline, providing facilities such as high-resolution timers.
> +sched_clock() is used for scheduling and timestamping.
> +
> +
> +Clock sources
> +-------------
> +
> +The purpose of the clock source is to provide a timeline for the system that
> +tells you where you are in time. For example issuing the command 'date' on
> +a Linux system will eventually read the clock source to determine exactly
> +what time it is.
> +
> +Typically the clock source is a monotonic, atomic counter which will provide
> +n bits which count from 0 to (2^n-1) and then wraps around to 0 and start over.
> +
> +The clock source shall have as high resolution as possible, and shall be as
> +stable and correct as possible as compared to a real-world wall clock. It
> +should not move unpredictably back and forth in time or miss a few cycles
> +here and there.
> +
> +It must be immune the kind of effects that occur in hardware where e.g. the
> +counter register is read in two phases on the bus lowest 16 bits first and
> +the higher 16 bits in a second bus cycle with the counter bits potentially
> +being updated inbetween leading to the risk of very strange values from the
> +counter.
> +
> +When the wall-clock accuracy of the clock source isn't satisfactory, there
> +are various quirks and layers in the timekeeping code for e.g. synchronizing
> +the user-visible time to RTC clocks in the system or against networked time
> +servers using NTP, but all they do is basically to update an offset against
> +the clock source, which provides the fundamental timeline for the system.
> +These measures does not affect the clock source per se.

It's not so much updating an offset, but more adjusting the frequency to
steer the clocksource to NTP time.

Also, while syncing the RTC is something that the timekeeping code does,
it's not really connected to the clocksource code in particular.


> +
> +The clock source struct shall provide means to translate the provided counter
> +into a rough nanosecond value as an unsigned long long (unsigned 64 bit) number.
> +Since this operation may be invoked very often doing this in a strict
> +mathematical sense is not desireable: instead the number is taken as close as
> +possible to a nanosecond value using only the arithmetic operations
> +mult and shift, so in clocksource_cyc2ns() you find:
> +
> + ns ~= (clocksource * mult) >> shift
> +
> +You will find a number of helper functions in the clock source code intended
> +to aid in providing these mult and shift values, such as
> +clocksource_khz2mult(), clocksource_hz2mult() that help determinining the
> +mult factor from a fixed shift, and clocksource_calc_mult_shift() and
> +clocksource_register_hz() which will help out assigning both shift and mult
> +factors using the frequency of the clock source and desirable minimum idle
> +time as the only input. In the past, the timekeeping authors would come up with
> +these values by hand, which is why you will sometimes find hard-coded shift
> +and mult values in the code.

Yea. I'm working on cleaning these out, so I'd recommend just pointing
to using clocksource_register_hz/khz(), to have a proper mult-shift pair
calculated out for you. The explanation about the hard-coded bit from
the past is good while we're in transition.

> +Since a 32 bit counter at say 100 MHz will wrap around to zero after some 43
> +seconds, the code handling the clock source will have to compensate for this.
> +That is the reason to why the clock source struct also contains a 'mask'
> +member telling how many bits of the source are valid. This way the timekeeping
> +code knows when the counter will wrap around and can insert the necessary
> +compensation code on both sides of the wrap point so that the system timeline
> +remains monotonic. Note that the clocksource_cyc2ns() function will not
> +compensate for wrap-arounds: it will return the rough number of nanoseconds
> +since the last wrap-around.

Hrm. There are some more non-obvious conditions on this. In fact, for
clocksources that wrap at longer periods, you may hit a multiplication
overflow before the wrap boundary.

I'm starting to feel like clocksource_cyc2ns() should be internalized to
the timekeeping code so its subtle limitations aren't accidentally
tripped over if it's incorrectly reused for some other purpose.
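
A rough user-space sketch of the overflow concern, assuming the
"(cycles * mult) >> shift" form from the patch and a 64-bit intermediate
product. The functions cyc2ns() and max_safe_cycles() below are
illustrative stand-ins, not the kernel's clocksource_cyc2ns(), and the
mult/shift pair in the note that follows is an assumed example:

```c
#include <assert.h>
#include <stdint.h>

/* ns ~= (cycles * mult) >> shift, evaluated in 64-bit arithmetic.
 * The product silently overflows once cycles exceeds UINT64_MAX / mult. */
static uint64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}

/* Largest cycle count for which cycles * mult still fits in 64 bits. */
static uint64_t max_safe_cycles(uint32_t mult)
{
	assert(mult != 0);
	return UINT64_MAX / mult;
}
```

For a 100 MHz counter (10 ns per cycle) with shift = 24 and
mult = 10 << 24, cyc2ns(100000000, mult, 24) gives 1000000000 ns as
expected, but max_safe_cycles(mult) is only about 1.1e11 cycles, i.e.
roughly 18 minutes of counting. A 64-bit counter at that rate would not
wrap for thousands of years, so the multiplication overflows long before
the wrap boundary is reached.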

In fact, as with the clocksource_register_hz/khz, I'm thinking we should
move more towards internalizing most of the complex bits of the
clocksource structure. I'm hoping a read(), freq_hz/khz value, rating
and flags would be all that's needed, hopefully simplifying things for
clocksource writers, and reducing the chance folks might get something
wrong.

thanks
-john

2010-11-15 19:48:57

by john stultz

Subject: Re: [PATCH] clocksource: document some basic concepts

On Mon, 2010-11-15 at 11:48 +0100, Peter Zijlstra wrote:
> On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> > +sched_clock()
> > +-------------
> > +
> > +In addition to the clock sources and clock events there is a special weak
> > +function in the kernel called sched_clock(). This function shall return the
> > +number of nanoseconds since the system was started. An architecture may or
> > +may not provide an implementation of sched_clock() on its own.
> > +
> > +As the name suggests, sched_clock() is used for scheduling the system,
> > +determining the absolute timeslice for a certain process in the CFS scheduler
> > +for example. It is also used for printk timestamps when you have selected to
> > +include time information in printk for things like bootcharts.
> > +
> > +Compared to clock sources, sched_clock() has to be very fast: it is called
> > +much more often, especially by the scheduler. If you have to do trade-offs
> > +between accuracy compared to the clock source, you may sacrifice accuracy
> > +for speed in sched_clock(). It however require the same basic characteristics
> > +as the clock source, i.e. it has to be monotonic.
>
> Not so, we prefer it be synchronized and monotonic, but we don't require
> so, see below.
>
> > +The sched_clock() function may wrap only on unsigned long long boundaries,
> > +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> > +after circa 585 years. (For most practical systems this means "never".)
>
> Currently true, John Stultz was going to look into amending this by
> teaching the kernel/sched_clock.c bits about early wraps (and a way for
> architectures to specify this)

I'd like to, although at the moment I don't have much space on my plate
to do this, so in the meantime, if someone has the time and interest to
look at this, ping me and I can lay out the basics of what likely
should be done.

thanks
-john

2010-11-15 20:06:31

by Nicolas Pitre

Subject: Re: [PATCH] clocksource: document some basic concepts

On Mon, 15 Nov 2010, Peter Zijlstra wrote:

> On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> > +The sched_clock() function may wrap only on unsigned long long boundaries,
> > +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> > +after circa 585 years. (For most practical systems this means "never".)

This is not necessarily the case. Some implementations require a
scaling factor too, making the number of remaining bits smaller than 64.
See arch/arm/mach-pxa/time.c:sched_clock() for example, which has a
maximum range of 208 days. Of course, in practice we don't really care
if sched_clock() wraps every 208 days, unlike for a clock source.

> Currently true, John Stultz was going to look into amending this by
> teaching the kernel/sched_clock.c bits about early wraps (and a way for
> architectures to specify this)
>
> #define SCHED_CLOCK_WRAP_BITS 48
>
> ...
>
> #ifdef SCHED_CLOCK_WRAP_BITS
> /* handle short wraps */
> #endif

Is this worth supporting? I'd simply use the low 32 bits and extend it
to 63 bits using cnt32_to_63(). If the low 32 bits are wrapping too
fast, then just shifting them down a few positions first should do the
trick. That certainly would have a much faster result.
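
A simplified user-space sketch of extending a wrapping 32-bit counter to
a wider value. This is not the implementation of cnt32_to_63(), which
works differently; struct ext_counter and extend32() are hypothetical
names used only to illustrate the general idea of tracking wraps in
software:

```c
#include <stdint.h>

/* Hypothetical software state for one extended counter. */
struct ext_counter {
	uint32_t last_lo;	/* last raw 32-bit value seen */
	uint32_t hi;		/* number of wraps observed so far */
};

/* Extend a raw 32-bit reading to 64 bits.
 * Must be called at least once per wrap period of the raw counter,
 * otherwise a wrap goes unnoticed. Not safe against concurrent callers. */
static uint64_t extend32(struct ext_counter *c, uint32_t raw)
{
	if (raw < c->last_lo)	/* raw value went backwards: one wrap */
		c->hi++;
	c->last_lo = raw;
	return ((uint64_t)c->hi << 32) | raw;
}
```

Shifting the raw value down by a few bits before passing it in lengthens
the wrap period by the same factor, at the cost of resolution.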


Nicolas

2010-11-15 21:13:02

by Peter Zijlstra

Subject: Re: [PATCH] clocksource: document some basic concepts

On Mon, 2010-11-15 at 15:06 -0500, Nicolas Pitre wrote:
> On Mon, 15 Nov 2010, Peter Zijlstra wrote:
>
> > On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> > > +The sched_clock() function may wrap only on unsigned long long boundaries,
> > > +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> > > +after circa 585 years. (For most practical systems this means "never".)
>
> This is not necessarily the case. Some implementations require a
> scaling factor too, making the number of remaining bits smaller than 64.
> See arch/arm/mach-pxa/time.c:sched_clock() for example, which has a
> maximum range of 208 days. Of course, in practice we don't really care
> if sched_clock() wraps every 208 days, unlike for a clock source.

Right, it's like sched_clock() would go backwards and we lose some
precision during that jiffy (assuming the arch uses
HAVE_UNSTABLE_SCHED_CLOCK), nothing too horrible.

> > Currently true, John Stultz was going to look into amending this by
> > teaching the kernel/sched_clock.c bits about early wraps (and a way for
> > architectures to specify this)
> >
> > #define SCHED_CLOCK_WRAP_BITS 48
> >
> > ...
> >
> > #ifdef SCHED_CLOCK_WRAP_BITS
> > /* handle short wraps */
> > #endif
>
> Is this worth supporting? I'd simply use the low 32 bits and extend it
> to 63 bits using cnt32_to_63(). If the low 32 bits are wrapping too
> fast, then just shifting them down a few positions first should do the
> trick. That certainly would have a much faster result.

Whatever works, dealing with the wrap is only a few shifts.