2006-03-26 21:47:44

by Chuck Ebbert

Subject: [patch] fix delay_tsc (was Re: delay_tsc(): inefficient delay loop (2.6.16-mm1))

In-Reply-To: <[email protected]>

On Fri, 24 Mar 2006 09:22:51 -0800, Ray Lee wrote:
> On 3/24/06, Andreas Mohr <[email protected]> wrote:
> > + loops += bclock;
> [...]
> > - } while ((now-bclock) < loops);
> > + } while (now < loops);
>
> Erm, aren't you introducing an overflow problem here?
>
> if loops is 2^32-1, bclock is 1, the old version would execute the
> proper number of times, the new one will blow out in one tick.

Yes, but the old version has a bug too. I don't have 2.6.16-mm1, but
in 2.6.16 it's in arch/i386/kernel/timers/timer_tsc.c and
arch/x86_64/lib/delay.c:

static void delay_tsc(unsigned long loops)
{
	unsigned long bclock, now;

	rdtscl(bclock);
	do
	{
		rep_nop();
		rdtscl(now);
	} while ((now-bclock) < loops);
}

If (loops == 100000) and (bclock == 2^32-1) the loop will terminate
immediately when the low part of the TSC overflows because (now-bclock)
is a large number. That can't be right...

I'm running a system with this applied now. I think there are still
problems if someone uses huge delays, though. What keeps someone from
trying to delay for > 2^31 cycles?


i386 delay_tsc() will truncate delays when the TSC is within 'loops'
of overflow. We must be able to handle TSC overflow both before and
after 'end', i.e. [1], [2] and [3] below.

                                            zero
                                              |
case A (end < start)    start  [1]            |  [2]  end  [3]
                                              |
case B (end > start)    start  [1]  end  [2]  |  [3]

Signed-off-by: Chuck Ebbert <[email protected]>

--- 2.6.16-d2.orig/arch/i386/kernel/timers/timer_tsc.c
+++ 2.6.16-d2/arch/i386/kernel/timers/timer_tsc.c
@@ -170,14 +170,22 @@ unsigned long long sched_clock(void)

 static void delay_tsc(unsigned long loops)
 {
-	unsigned long bclock, now;
+	unsigned long start, end, now;
 
-	rdtscl(bclock);
-	do
-	{
-		rep_nop();
-		rdtscl(now);
-	} while ((now-bclock) < loops);
+	rdtscl(start);
+	end = start + loops;
+
+	if (unlikely(end < start)) {
+		do {
+			rep_nop();
+			rdtscl(now);
+		} while (now > start || now < end);
+	} else {
+		do {
+			rep_nop();
+			rdtscl(now);
+		} while (now > start && now < end);
+	}
 }
 
 #ifdef CONFIG_HPET_TIMER
--
Chuck
"Penguins don't come from next door, they come from the Antarctic!"


2006-03-27 00:19:42

by Andi Kleen

Subject: Re: [patch] fix delay_tsc (was Re: delay_tsc(): inefficient delay loop (2.6.16-mm1))


> I'm running a system with this applied now. I think there are still
> problems if someone uses huge delays, though. What keeps someone from
> trying to delay for > 2^31 cycles?

You shouldn't. The caller has a compile-time check for that. And if you pass
in dynamic values, you get what you deserve.

-Andi

2006-03-27 00:42:24

by Ray Lee

Subject: Re: [patch] fix delay_tsc (was Re: delay_tsc(): inefficient delay loop (2.6.16-mm1))

On 3/26/06, Chuck Ebbert <[email protected]> wrote:
> On Fri, 24 Mar 2006 09:22:51 -0800, Ray Lee wrote:
> > On 3/24/06, Andreas Mohr <[email protected]> wrote:
> > > + loops += bclock;
> > [...]
> > > - } while ((now-bclock) < loops);
> > > + } while (now < loops);
> >
> > Erm, aren't you introducing an overflow problem here?
> >
> > if loops is 2^32-1, bclock is 1, the old version would execute the
> > proper number of times, the new one will blow out in one tick.
>
> Yes, but the old version has a bug too.
[...]
> If (loops == 100000) and (bclock == 2^32-1) the loop will terminate
> immediately when the low part of the TSC overflows because (now-bclock)
> is a large number.

Er, no, it won't, because (now-bclock) won't be large.

I know thinking about math on a modulo number line such as u8/16/32 is
odd, but it's best if you just always think of "subtraction" to mean
"distance between." (Which is always true in any space or coordinate
system, even with wrap arounds.) This is the same trick used by
Andrew's ring buffers, where you let head and tail wrap around freely,
and only perform the modulo operation at dereferencing.

A simple test program will give you a better feel for what's going on
(I write a lot of these...):

#include <stdio.h>
int main() {
	unsigned int a,b,c;
	a=-1-1;
	b=1000;
	c=b-a;
	printf("%u - %u = %u\n", b, a, c);
}

ray@issola:~/work/test/overflow$ gcc -o test test.c
ray@issola:~/work/test/overflow$ ./test
1000 - 4294967294 = 1002

So, it wraps appropriately, as odd as that may seem at first blush.

Ray