2012-05-30 18:38:13

by Alan Stern

Subject: Use of high-res timers

Thomas, Ingo, or anyone else:

My driver needs to time a bunch of events, with roughly millisecond
precision. Up to now it has used old-fashioned timer_lists and the
jiffies counter, but I'm switching over to high-resolution timers and
ktime_get.

This leads to a few questions (these issues don't seem to be addressed
anywhere in Documentation/timers):

Should I be concerned about efficiency? I may well end up
calling ktime_get several times per millisecond; is it fast
enough for this to be okay?

I need timed intervals with reliable lower bounds. Let's say
I call ktime_get twice, maybe once in an interrupt handler and
once in an hrtimer callback (not necessarily on the same CPU).
Some action has to be taken no earlier than 1 ms after the
first call. If the second call returns a value that is at
least 1 ms larger than the first call, is that enough of a
guarantee? If not, how much larger does it have to be?

Which has more overhead: adding and cancelling an hrtimer
several times, or simply letting it expire and returning
immediately from the callback? (I wouldn't be surprised if
there was no good answer.)

Thanks,

Alan Stern


2012-05-31 07:37:03

by Thomas Gleixner

Subject: Re: Use of high-res timers

On Wed, 30 May 2012, Alan Stern wrote:

> Thomas, Ingo, or anyone else:
>
> My driver needs to time a bunch of events, with roughly millisecond
> precision. Up to now it has used old-fashioned timer_lists and the
> jiffies counter, but I'm switching over to high-resolution timers and
> ktime_get.
>
> This leads to a few questions (these issues don't seem to be addressed
> anywhere in Documentation/timers):
>
> Should I be concerned about efficiency? I may well end up
> calling ktime_get several times per millisecond; is it fast
> enough for this to be okay?

It had better be.

> I need timed intervals with reliable lower bounds. Let's say
> I call ktime_get twice, maybe once in an interrupt handler and
> once in an hrtimer callback (not necessarily on the same CPU).
> Some action has to be taken no earlier than 1 ms after the
> first call. If the second call returns a value that is at
> least 1 ms larger than the first call, is that enough of a
> guarantee? If not, how much larger does it have to be?

ktime_get() is precise. Can you explain what you are trying to solve?

> Which has more overhead: adding and cancelling an hrtimer
> several times, or simply letting it expire and returning
> immediately from the callback? (I wouldn't be surprised if
> there was no good answer.)

There is no really good answer.

Thanks,

tglx

2012-05-31 14:47:13

by Alan Stern

Subject: Re: Use of high-res timers

On Thu, 31 May 2012, Thomas Gleixner wrote:

> > I need timed intervals with reliable lower bounds. Let's say
> > I call ktime_get twice, maybe once in an interrupt handler and
> > once in an hrtimer callback (not necessarily on the same CPU).
> > Some action has to be taken no earlier than 1 ms after the
> > first call. If the second call returns a value that is at
> > least 1 ms larger than the first call, is that enough of a
> > guarantee? If not, how much larger does it have to be?
>
> ktime_get() is precise. Can you explain what you are trying to solve?

Here's an example. A hardware device accesses a software data
structure via DMA, and the driver needs to change the data structure.
However, the data can't be updated safely while the device is using it.
Furthermore, we know that the device may continue to access the data
for as long as 1 ms after being told to stop (because of internal
caches and such).

It's okay to wait longer than 1 ms, but we'd like to minimize the wait
time in order to avoid delaying I/O unnecessarily. Therefore:

(1) The driver removes the pointer to the data structure from
the device's DMA list, then calls ktime_get, adds 1 ms, and
stores the result.

(2) The driver waits for a while (details are unimportant).

(3) Some time later, the driver calls ktime_get again and compares
the stored value to the new value. If the new value is
smaller, go back to step (2).

(4) Now the driver knows that at least 1 ms has passed since (1),
and therefore any ongoing DMA has finished and the pointer has
been dropped from the device's cache. Thus the device cannot
be doing DMA to the data structure any more, so the data can be
updated safely.
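
The four steps above can be sketched in userspace C. This is only an illustration under stated assumptions: clock_gettime(CLOCK_MONOTONIC) stands in for the kernel's ktime_get(), and the names mono_now_ns, quiesce_deadline, and quiesce_done are hypothetical, not taken from any driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL
#define QUIESCE_NS   1000000LL  /* 1 ms: longest time the device may keep using the data */

/* Userspace stand-in for ktime_get(): monotonic time in nanoseconds. */
static int64_t mono_now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/* Step (1): timestamp taken right after unlinking, plus 1 ms. */
static int64_t quiesce_deadline(int64_t unlinked_at_ns)
{
    return unlinked_at_ns + QUIESCE_NS;
}

/*
 * Step (3): compare a fresh timestamp against the stored deadline.
 * A false result means "go back to step (2)"; true means step (4),
 * i.e. the data structure may now be updated.
 */
static bool quiesce_done(int64_t now_ns, int64_t deadline_ns)
{
    return now_ns >= deadline_ns;
}
```

A caller would store quiesce_deadline(mono_now_ns()) at unlink time and later poll quiesce_done(mono_now_ns(), deadline) from whatever context performs the wait.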

The key here is the assumption in step (4): If the new value from
ktime_get exceeds the stored value then one millisecond of time really
has elapsed. I can imagine this might not hold true if the two calls
to ktime_get were made on different CPUs, or possibly for other
reasons.

So my question is: What value should be stored in step (1) to guarantee
that the assumption is valid?

More or less equivalently, what is the relative error between two calls
of ktime_get?

Alan Stern

2012-05-31 15:04:41

by Thomas Gleixner

Subject: Re: Use of high-res timers

On Thu, 31 May 2012, Alan Stern wrote:
> On Thu, 31 May 2012, Thomas Gleixner wrote:
>
> > > I need timed intervals with reliable lower bounds. Let's say
> > > I call ktime_get twice, maybe once in an interrupt handler and
> > > once in an hrtimer callback (not necessarily on the same CPU).
> > > Some action has to be taken no earlier than 1 ms after the
> > > first call. If the second call returns a value that is at
> > > least 1 ms larger than the first call, is that enough of a
> > > guarantee? If not, how much larger does it have to be?
> >
> > > ktime_get() is precise. Can you explain what you are trying to solve?
>
> Here's an example. A hardware device accesses a software data
> structure via DMA, and the driver needs to change the data structure.
> However, the data can't be updated safely while the device is using it.
> Furthermore, we know that the device may continue to access the data
> for as long as 1 ms after being told to stop (because of internal
> caches and such).

And of course the hardware designers decided that there is no need for
a reliable way to detect that....

> It's okay to wait longer than 1 ms, but we'd like to minimize the wait
> time in order to avoid delaying I/O unnecessarily. Therefore:
>
> (1) The driver removes the pointer to the data structure from
> the device's DMA list, then calls ktime_get, adds 1 ms, and
> stores the result.
>
> (2) The driver waits for a while (details are unimportant).
>
> (3) Some time later, the driver calls ktime_get again and compares
> the stored value to the new value. If the new value is
> smaller, go back to step (2).
>
> (4) Now the driver knows that at least 1 ms has passed since (1),
> and therefore any ongoing DMA has finished and the pointer has
> been dropped from the device's cache. Thus the device cannot
> be doing DMA to the data structure any more, so the data can be
> updated safely.
>
> The key here is the assumption in step (4): If the new value from
> ktime_get exceeds the stored value then one millisecond of time really
> has elapsed. I can imagine this might not hold true if the two calls
> to ktime_get were made on different CPUs, or possibly for other
> reasons.

No. ktime_get() is guaranteed to be monotonic across CPUs.

> So my question is: What value should be stored in step (1) to guarantee
> that the assumption is valid?
>
> More or less equivalently, what is the relative error between two calls
> of ktime_get?

It just depends on the resolution of the underlying clocksource and NTP
adjustments.

So the only case where you can run into trouble is when the
clocksource is coarse grained, e.g. pure jiffies, where two
consecutive calls can show a 1/HZ delta.

But you really should not worry much about that, unless you are aiming
for some stone-age platform. Anything up to date is going to have at
least a 32kHz counter based clocksource. At 32kHz the per clock tick
increment is ~30us, so that's your expected error.
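
A minimal sketch of the arithmetic above, in userspace C. Assumptions: the usage note takes a "32kHz" counter to run at 32768 Hz, the one-tick padding is an illustrative safety margin rather than anything prescribed in this thread, and the helper names tick_period_ns and padded_deadline_ns are hypothetical.

```c
#include <stdint.h>

#define NS_PER_SECOND 1000000000ULL

/*
 * Length of one clocksource tick in nanoseconds, rounded up so that a
 * margin derived from it never underestimates the resolution.
 * freq_hz must be nonzero.
 */
static uint64_t tick_period_ns(uint64_t freq_hz)
{
    return (NS_PER_SECOND + freq_hz - 1) / freq_hz;
}

/*
 * Deadline for a minimum wait of wait_ns starting at start_ns, padded
 * by one clocksource tick (assumed margin, see the lead-in).
 */
static uint64_t padded_deadline_ns(uint64_t start_ns, uint64_t wait_ns,
                                   uint64_t freq_hz)
{
    return start_ns + wait_ns + tick_period_ns(freq_hz);
}
```

At 32768 Hz the tick comes out at about 30.5 us, so a 1 ms wait would be padded to roughly 1.03 ms. With a pure-jiffies clocksource at HZ=1000 the tick is a full 1 ms, which is the coarse-grained case described above.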

Thanks,

tglx

2012-05-31 15:26:26

by Luming Yu

Subject: Re: Use of high-res timers

>
> No. ktime_get() is guaranteed to be monotonic across CPUs.
>

Hi,

We probably need a tool to enable people to test it out. I'd like to
know if you would be interested in queuing up a tool
(https://lkml.org/lkml/2012/4/10/282) for 3.5.
I've pinged someone, but I'm not sure they have spare bandwidth or are
still interested in it.

TIA.

2012-05-31 15:27:46

by Alan Stern

Subject: Re: Use of high-res timers

On Thu, 31 May 2012, Thomas Gleixner wrote:

> > Here's an example. A hardware device accesses a software data
> > structure via DMA, and the driver needs to change the data structure.
> > However, the data can't be updated safely while the device is using it.
> > Furthermore, we know that the device may continue to access the data
> > for as long as 1 ms after being told to stop (because of internal
> > caches and such).
>
> And of course the hardware designers decided that there is no need for
> a reliable way to detect that....

Well, it's not quite that bad. In fact the hardware design does call
for a counter that increments at 8000 Hz; it could be used for this
purpose.

Except... In some implementations, the counter stops at unpredictable
times! It's not reliable; hence this workaround.

> It just depends on the resolution of the underlying clocksource and NTP
> adjustments.
>
> So the only case where you can run into trouble is when the
> clocksource is coarse grained, e.g. pure jiffies, where two
> consecutive calls can show a 1/HZ delta.
>
> But you really should not worry much about that, unless you are aiming
> for some stone-age platform. Anything up to date is going to have at
> least a 32kHz counter based clocksource. At 32kHz the per clock tick
> increment is ~30us, so that's your expected error.

Just what I needed to know! Thanks.

Alan Stern