2004-09-01 23:21:53

by Christoph Lameter

[permalink] [raw]
Subject: Re: [RFC] New timeofday implementation proposal

On Tue, 17 Aug 2004, john stultz wrote:

> o Consolidates a large amount of code:
> Allows for shared times source implementations, such as: i386, x86-64
> and ia64 all use HPET, i386 and x86-64 both have ACPI PM timers, and
> i386 and ia64 both have cyclone counters. Time sources are just drivers!
> Also work for user space gettimeofday implementations will be able to be
> shared across all arches (assuming the hardware time source can be
> safely accessed from user space).

What about a hardware time source that can be safely accessed with a fast
system call (f.e. via epc on IA64)? My tests indicate that such an
implementation is comparable to a user space memory mapped solution. The
user space memory mapping might generate complexities. Especially since
the page mapped for a memory mapped timer may allow access to hardware
information that should not be exposed to user space.

The time interpolator patches that I posted a while back provide a
generic interface to timer registers / values which may be useful to what
you are trying to accomplish. The C code for that patch is platform
independent but there is also an asm fast path that is specific to IA64.
Other arches could develop similar fastpaths.

> o Uses nanoseconds as the kernel's base time unit.
> Rather then doing ugly manipulations to timevals or timespecs, this
> simplifies math, and gives us plenty of room to grow (64bits of
> nanoseconds ~= 584 years).

The nanoseconds patch that was accepted into 2.6.9-rc1 does do that
partially by providing a getnstimeofday and centralizing the instances
where microseconds are multiplied by 1000 get to nanoseconds.

> o Have to convert back to time_val for syscall interface

This is mostly covered by gettimeofday()

> o ntp_scale(ns): scales ns by NTP scaling factor
> - costly, but correct.

May we would need 128bit arithmetic to increase the accurary of the
scaling?

> o Some arches (arm, for example) do not have high res timing hardware
> - In this case we can have a "jiffies" timesource
> - cyc2ns(x) = x*(NSEC_PER_SEC/HZ)
> - doesn't work for tickless systems

Most arches have already high res time sources. I think we just need to
make proper use of them.

> o suspend/resume
> - need to pause and restart the timesource reads
> - we don't want a gigantic or negative offset!

Some intelligent timer needs to survive the suspend. Timers may need an
attribute to show if they continue counting through suspense/resume etc
(various power conditions etc....)

Hope this helps ....