This patch fixes an #endif misplacement, which leads to dead code in
sched_clock() in arch/i386/kernel/timers/timer_tsc.c, due to a return
outside the ifdef/endif.
Please consider applying, as sched_clock() apparently does not behave as
expected. Patched against 2.6.0-test11.
Regards.
--
Ricardo Nabinger Sanchez
GNU/Linux #140696 [http://counter.li.org]
Slackware Linux
Warning:
Trespassers will be shot.
Survivors will be shot again.
diff -urN linux-2.6.0-test11/arch/i386/kernel/timers/timer_tsc.c
linux-2.6.0-test11-sched_clock/arch/i386/kernel/timers/timer_tsc.c
--- linux-2.6.0-test11/arch/i386/kernel/timers/timer_tsc.c 2003-11-26 18:44:45.000000000 -0200
+++ linux-2.6.0-test11-sched_clock/arch/i386/kernel/timers/timer_tsc.c 2003-11-28 12:58:59.000000000 -0200
@@ -140,8 +140,8 @@
*/
#ifndef CONFIG_NUMA
if (!use_tsc)
-#endif
return (unsigned long long)jiffies * (1000000000 / HZ);
+#endif
/* Read the Time Stamp Counter */
rdtscll(this_offset);
Hi Tim,
Tim Schmielau wrote:
> No, this is exactly what is intended: don't use the TSC on NUMA, use
> jiffies instead.
The patch didn't hurt this.
> Look at the comment just above those lines.
The patch doesn't uses jiffies indiscriminately: Only if we're on a
NUMA system with !use_tsc.
Otherwise (on x86 SMP, for example) we use rdtsc...which seems The
Right Thing(tm). Hece move the #endif a bit down.
Cheers
Felipe
On Fri, 28 Nov 2003, Ricardo Nabinger Sanchez wrote:
> This patch fixes an #endif misplacement, which leads to dead code in
> sched_clock() in arch/i386/kernel/timers/timer_tsc.c, due to a return
> outside the ifdef/endif.
No, this is exactly what is intended: don't use the TSC on NUMA, use
jiffies instead.
Look at the comment just above those lines.
Tim
> --- linux-2.6.0-test11/arch/i386/kernel/timers/timer_tsc.c 2003-11-26 18:44:45.000000000 -0200
> +++ linux-2.6.0-test11-sched_clock/arch/i386/kernel/timers/timer_tsc.c 2003-11-28 12:58:59.000000000 -0200
> @@ -140,8 +140,8 @@
> */
> #ifndef CONFIG_NUMA
> if (!use_tsc)
> -#endif
> return (unsigned long long)jiffies * (1000000000 / HZ);
> +#endif
>
> /* Read the Time Stamp Counter */
> rdtscll(this_offset);
Hi,
Actually, the original code seems to be ok:
#ifndef CONFIG_NUMA
if (!use_tsc)
#endif
return (unsigned long long)jiffies * (1000000000 / HZ);
That is: on x86 we'll get into "#ifndef CONFIG_NUMA", and "if (!use_tsc)" will
be called, which should be the expected behaviour.
Maybe in your case the use_tsc flag was being set to 0 (bug in detection code
/ unsupported feature by the processor?), leading your code to ignore the TSC
and return the current time based on the current jiffies instead.
Lucas
On Friday 28 November 2003 14:39, Felipe W Damasio wrote:
> Hi Tim,
>
> Tim Schmielau wrote:
> > No, this is exactly what is intended: don't use the TSC on NUMA, use
> > jiffies instead.
>
> The patch didn't hurt this.
>
> > Look at the comment just above those lines.
>
> The patch doesn't uses jiffies indiscriminately: Only if we're on a
> NUMA system with !use_tsc.
>
> Otherwise (on x86 SMP, for example) we use rdtsc...which seems The
> Right Thing(tm). Hece move the #endif a bit down.
>
> Cheers
>
> Felipe
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
Hi Nick,
Nick Piggin wrote:
> The ifdef isn't pretty, but its performance critical code, its easy to
> understand, and there is a big comment above it. I think its OK the
> way it is. Not that you would ever notice any difference probably.
You're right. As Lucas already pointed out, the ifdef CONFIG_NUMA is
actually an ifndef...
Like myself, I think Ricardo overlooked this :)
Cheers,
Felipe
Felipe W Damasio wrote:
> Hi Tim,
>
> Tim Schmielau wrote:
>
>> No, this is exactly what is intended: don't use the TSC on NUMA, use
>> jiffies instead.
>
>
> The patch didn't hurt this.
>
>> Look at the comment just above those lines.
>
>
> The patch doesn't uses jiffies indiscriminately: Only if we're on
> a NUMA system with !use_tsc.
>
> Otherwise (on x86 SMP, for example) we use rdtsc...which seems The
> Right Thing(tm). Hece move the #endif a bit down.
The ifdef isn't pretty, but its performance critical code, its easy to
understand, and there is a big comment above it. I think its OK the
way it is. Not that you would ever notice any difference probably.
> On Fri, 28 Nov 2003, Ricardo Nabinger Sanchez wrote:
>
> > This patch fixes an #endif misplacement, which leads to dead code in
> > sched_clock() in arch/i386/kernel/timers/timer_tsc.c, due to a return
> > outside the ifdef/endif.
>
> No, this is exactly what is intended: don't use the TSC on NUMA, use
> jiffies instead.
> Look at the comment just above those lines.
HOWEVER, we seem to have some jiffies wrap bugs in this file.
Tim
--- linux-2.6.0-test11/arch/i386/kernel/timers/timer_tsc.c.orig 2003-11-28 17:50:49.000000000 +0100
+++ linux-2.6.0-test11/arch/i386/kernel/timers/timer_tsc.c 2003-11-28 18:18:44.000000000 +0100
@@ -30,7 +30,6 @@
int tsc_disable __initdata = 0;
extern spinlock_t i8253_lock;
-extern volatile unsigned long jiffies;
static int use_tsc;
/* Number of usecs that the last interrupt was delayed */
@@ -141,7 +140,7 @@
#ifndef CONFIG_NUMA
if (!use_tsc)
#endif
- return (unsigned long long)jiffies * (1000000000 / HZ);
+ return (unsigned long long)get_jiffies_64() * (1000000000 / HZ);
/* Read the Time Stamp Counter */
rdtscll(this_offset);
@@ -215,7 +214,8 @@
lost = delta/(1000000/HZ);
delay = delta%(1000000/HZ);
if (lost >= 2) {
- jiffies += lost-1;
+ /* only called under xtime_lock */
+ jiffies_64 += lost-1;
/* sanity check to ensure we're not always losing ticks */
if (lost_count++ > 100) {
@@ -241,7 +241,7 @@
* usec delta is > 90% # of usecs/tick)
*/
if (lost && abs(delay - delay_at_last_interrupt) > (900000/HZ))
- jiffies++;
+ jiffies_64++;
}
static void delay_tsc(unsigned long loops)
@@ -283,7 +283,8 @@
offset = hpet_readl(HPET_T0_CMP) - hpet_tick;
if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))) {
int lost_ticks = (offset - hpet_last) / hpet_tick;
- jiffies += lost_ticks;
+ /* only called under xtime_lock */
+ jiffies_64 += lost_ticks;
}
hpet_last = hpet_current;
> HOWEVER, we seem to have some jiffies wrap bugs in this file.
Should have checked the whole directory first, the chance of cut'n paste
bugs was just too high...
Tim
--- linux-2.6.0-test11/arch/i386/kernel/timers/timer_hpet.c.orig 2003-11-28 19:21:32.000000000 +0100
+++ linux-2.6.0-test11/arch/i386/kernel/timers/timer_hpet.c 2003-11-28 19:22:28.000000000 +0100
@@ -108,7 +108,8 @@
offset = hpet_readl(HPET_T0_CMP) - hpet_tick;
if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))) {
int lost_ticks = (offset - hpet_last) / hpet_tick;
- jiffies += lost_ticks;
+ /* only called under xtime_lock */
+ jiffies_64 += lost_ticks;
}
hpet_last = offset;
Quoting Felipe W Damasio <[email protected]>
Sent on Fri, 28 Nov 2003 15:12:20 -0200
> Hi Nick,
>
> Nick Piggin wrote:
> > The ifdef isn't pretty, but its performance critical code, its easy to
> > understand, and there is a big comment above it. I think its OK the
> > way it is. Not that you would ever notice any difference probably.
>
> You're right. As Lucas already pointed out, the ifdef CONFIG_NUMA is
> actually an ifndef...
> Like myself, I think Ricardo overlooked this :)
Oops! Completely overlooked this. Thanks for correcting me :)
--
Ricardo Nabinger Sanchez
GNU/Linux #140696 [http://counter.li.org]
Slackware Linux
Warning:
Trespassers will be shot.
Survivors will be shot again.
Quoting Tim Schmielau <[email protected]>
Sent on Fri, 28 Nov 2003 17:34:49 +0100 (CET)
> > This patch fixes an #endif misplacement, which leads to dead code in
> > sched_clock() in arch/i386/kernel/timers/timer_tsc.c, due to a return
> > outside the ifdef/endif.
>
> No, this is exactly what is intended: don't use the TSC on NUMA, use
> jiffies instead.
> Look at the comment just above those lines.
I'm breaking things. Sorry.
I think I understood it now: the #ifndef protects only the check for TSC
availability on non-NUMA archs. If it's available, and not under NUMA (so
the ifndef), use it (jump to the rdtscll()), otherwise return the expression
result.
The strange thing to me is that I'm getting 1/10 of the expected value when
measuring tasks timeslices. Instead of getting ~100ms for tasks which just
burn CPU, I'm getting 10ms.
I measure timeslices inside schedule(), updating the average timeslice for
the leaving process, using (now - prev->timestamp).
Any clue of what am I doing wrong?
Regards.
unsigned long long sched_clock(void)
{
unsigned long long this_offset;
/*
* In the NUMA case we dont use the TSC as they are not
* synchronized across all CPUs.
*/
#ifndef CONFIG_NUMA
if (!use_tsc)
#endif
return (unsigned long long)jiffies * (1000000000 / HZ);
/* Read the Time Stamp Counter */
rdtscll(this_offset);
/* return the value in ns */
return cycles_2_ns(this_offset);
}
--
Ricardo Nabinger Sanchez
GNU/Linux #140696 [http://counter.li.org]
Slackware Linux
Warning:
Trespassers will be shot.
Survivors will be shot again.
On Friday 28 November 2003 19:35, Ricardo Nabinger Sanchez wrote:
> Quoting Tim Schmielau <[email protected]>
> Sent on Fri, 28 Nov 2003 17:34:49 +0100 (CET)
>
> > > This patch fixes an #endif misplacement, which leads to dead code in
> > > sched_clock() in arch/i386/kernel/timers/timer_tsc.c, due to a return
> > > outside the ifdef/endif.
> >
> > No, this is exactly what is intended: don't use the TSC on NUMA, use
> > jiffies instead.
> > Look at the comment just above those lines.
>
> I'm breaking things. Sorry.
>
> I think I understood it now: the #ifndef protects only the check for TSC
> availability on non-NUMA archs. If it's available, and not under NUMA (so
> the ifndef), use it (jump to the rdtscll()), otherwise return the
> expression result.
>
> The strange thing to me is that I'm getting 1/10 of the expected value when
> measuring tasks timeslices. Instead of getting ~100ms for tasks which just
> burn CPU, I'm getting 10ms.
>
> I measure timeslices inside schedule(), updating the average timeslice for
> the leaving process, using (now - prev->timestamp).
>
> Any clue of what am I doing wrong?
Hi,
It could it be possible that TSC isn't being enabled on your processor due to
a buggy TSC. You can try to trace the code on init_tsc(), in
arch/i386/kernel/timers/timer_tsc.c to realize what's going on.
You can also try to check whether you're using HPET or not (when enabled and
supported it makes use of TSC) in your .config . In the case you're not using
it, the code that calibrates and dictates how to deal with the stamp counter
is calibrate_tsc(), in arch/i386/kernel/timers/common.c.
Hope this helps,
Lucas