On Fri, 2006-03-03 at 13:15 -0800, Andrew Morton wrote:
> So we can do this, but the question is do we _want_ to do it? If the arch
> maintainers can look at it from a high level and say "yup, I can use that
> and it'll improve/simplify/speedup/reduce my code" then yes, it's worth
> making the effort.
So, with 2.6.16 closing out, its time to decide if the
following generic timekeeping code is ready to go into 2.6.17. I know
this is pretty dull code, however if any of you arch maintainers could
give it a quick once-over and let Andrew and me know if you have any
positive or negative reaction to this code, I would really appreciate
it.
(To avoid spamming important and busy arch maintainers, I've only CC'ed
them on this announcement. The full patchset can be found on lkml.)
Please note, that there are still improvements to be made. Roman, for
instance, has made some good suggestions that I'm working to implement.
However, I feel the code as it stands _is_ ready for inclusion, as it
resolves a number of issues that users are seeing, and has been in the
-rt and -mm trees for some time with few problem reports.
Summary:
This patchset provides a generic timekeeping subsystem that is
independent of the timer interrupt. This allows for robust and correct
behavior in cases of late or lost ticks, avoids interpolation errors,
reduces duplication in arch specific code, and allows or assists future
changes such as high-res timers, dynamic ticks, or realtime preemption.
Additionally, it provides finer nanosecond resolution values to the
clock_gettime functions.
Why do we need this?
Correctness: With the current tick-based interpolated timekeeping used
in most architectures, there is the possibility for time inconsistencies
caused by NTP adjustment, less then perfect calibration of the highres
time hardware, or late or missed ticks. Over time, a number of fixes
have been added trying to minimize the effect of these errors, but as
hardware gets faster, the observable window for time inconsistencies
increases and the fixes are less effective. Further, correct
timekeeping becomes even more difficult with future features like
dynamic ticks or with realtime preemption when using the existing
timekeeping code.
Consolidation: We have quite a bit of duplication between the arches in
their timekeeping code. Many have chosen slightly different paths which
can make it difficult to resolve bugs or enable new features that could
be more generic. Additionally since the interaction between arch
generic and arch specific code is not well specified, the code has
become quite fragile as timekeeping fixes may or may not affect other
arches and may not affect them in the same way.
Its a bit outdated, but my 2005 OLS presentation (basically more of the
same, but with a few pictures) on this code can be found here:
http://sr71.net/~jstultz/tod/ols-presentation-final.pdf
Overview of the code:
This patchset that I'm submitting breaks up into a few basic chunks:
1) Minimal rework of the NTP code to isolate it so that it does not
directly change the timekeeping variables as a side effect. Provides
accessors to the NTP state machine that allows the timekeeping core to
modify time accordingly.
2) Introduces the clocksource abstraction, which provides a generic
method to register, select and use hardware specific clocksource
drivers.
3) Introduces the generic timekeeping core that uses the clocksource
infrastructure instead of the timer interrupt to maintain time and
provides generic accessors to the system time and wall time. Also uses
the NTP changes to scale time smoothly between ticks.
4) Patches to convert the i386 arch to use the timekeeping core
5) Finally a patch to provide clocksource drivers for i386 hardware
I also have patches that provide #4 and #5 for powerpc and x86-64,
however they are not ready for submission, but you can find them at the
link below.
Changes since the B19 release:
o Synced w/ changes from -mm
o jiffies/jiffies_64 and related locking fix (from Atsushi Nemoto)
o Dropped acpi_pm_buggy paranoia (Suggested by Adrian Bunk)
o Re-remove duplicate leapsecond code which accidentally got dropped.
Outstanding issues:
o No reported problem from testing in -mm
Still on my TODO list:
o Possible merge with Roman's NTP changes
o Continue working on improved ntp error accounting patch
o Squish any bugs that pop up from testing
o Clean and split up x86-64 and powerpc patches
o ppc, s390, arm, ia64, alpha, sparc, sparc64, etc work
The patchset applies against the current 2.6.16-rc5-git.
The complete patchset (including unsubmitted powerpc and x86-64 code)
can be found here:
http://sr71.net/~jstultz/tod/
I'd like to thank the following people who have contributed ideas,
criticism, testing and code that has helped shape this work:
George Anzinger, Nish Aravamudan, Max Asbock, Serge Belyshev,
Dominik Brodowski, Adrian Bunk, Thomas Gleixner, Darren Hart, Christoph
Lameter, Matt Mackal, Keith Mannthey, Ingo Molnar, Andrew Morton, Paul
Munt, Martin Schwidefsky, Frank Sorenson, Ulrich Windl, Jonathan
Woithe, Darrick Wong, Roman Zippel and any others whom I've
accidentally left off this list.
thanks
-john
* john stultz <[email protected]> wrote:
> Summary:
> This patchset provides a generic timekeeping subsystem that is
> independent of the timer interrupt. This allows for robust and correct
> behavior in cases of late or lost ticks, avoids interpolation errors,
> reduces duplication in arch specific code, and allows or assists
> future changes such as high-res timers, dynamic ticks, or realtime
> preemption. Additionally, it provides finer nanosecond resolution
> values to the clock_gettime functions.
>
> Why do we need this?
i'd like to add the following to the list of reasons as well: the
current -hrt patch-queue (high resolution timers, which enables true
microsecond-level resolution for POSIX timers and nanosleep()), which is
working pretty well in the -rt tree and implements an important feature
for Linux, is based on the generic timekeeping subsystem as well.
Ingo