Hey Marcelo,
I know its probably poor form to send this out so close to -rc, but I
figured I might as well give it a shot. I'll happily resubmit this for
the .20pre series later if you'd prefer.
This patch disables the TSCs when compiled for Multiquad NUMA hardware.
Due to the slower interconnect, the TSCs aren't being synced properly at
boot time. Even if they were synced, since the different nodes are
driven by different crystals, the TSCs still drift.
This results in sequential calls to gettimeofday to return
non-sequential time values. By disabling the TSCs on these boxes, it
forces gettimeofday to use the PIC clock instead, fixing the problem.
Let me know if you have any comments.
Thanks
-john
Hi Ben,
On Fri, Jun 14, 2002 at 02:53:07PM -0400, Benjamin LaHaise wrote:
> On Fri, Jun 14, 2002 at 11:35:26AM -0700, john stultz wrote:
> > This results in sequential calls to gettimeofday to return
> > non-sequential time values. By disabling the TSCs on these boxes, it
> > forces gettimeofday to use the PIC clock instead, fixing the problem.
>
> This seems to be yet another reason for supporting per-CPU TSC
> calibration, as that would fix machines with different speed cpus, too.
I agree.
Maybe the patch I once made to support CPUs with different speeds can serve
as a starting point?
http://www.uwsg.iu.edu/hypermail/linux/kernel/0203.1/0481.html
However, one would need to make sure that all CPUs occasionally do receive
timer interrupts, otherwise your TSC may overflow. Depending on your
hardware (APICs), this might be an issue. I've been told that Fosters do
misbehave ...
Regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE Linux AG, Nuernberg, DE SCSI, Security
On Mon, 2002-06-17 at 17:48, Kurt Garloff wrote:
> Hi Ben,
>
> On Fri, Jun 14, 2002 at 02:53:07PM -0400, Benjamin LaHaise wrote:
> > On Fri, Jun 14, 2002 at 11:35:26AM -0700, john stultz wrote:
> > > This results in sequential calls to gettimeofday to return
> > > non-sequential time values. By disabling the TSCs on these boxes, it
> > > forces gettimeofday to use the PIC clock instead, fixing the problem.
> >
> > This seems to be yet another reason for supporting per-CPU TSC
> > calibration, as that would fix machines with different speed cpus, too.
>
> I agree.
> Maybe the patch I once made to support CPUs with different speeds can serve
> as a starting point?
>
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0203.1/0481.html
>
> However, one would need to make sure that all CPUs occasionally do receive
> timer interrupts, otherwise your TSC may overflow. Depending on your
> hardware (APICs), this might be an issue. I've been told that Fosters do
> misbehave ...
Hmmm, I've skimmed your patch before, but really only just for the
lowest common features bit. I didn't quite grasp it last time, but I
really like what you're doing there. TSC overflow can be almost
eliminated if we use the full 64bits, but if we can round robin the
interrupts, that would help compensate for any skew in the clocks.
Thanks for the pointer!
-john
On Sat, 15 Jun 2002, Mikael Pettersson wrote:
> I disagree with Alan's recommendation.
So do I.
> The real problem is that the kernel confuses a CPU-level property
> (do the CPUs have TSCs?) with a system-level property (are the
> TSCs present and in sync?). CONFIG_X86_TSC really describes the
> latter property, for the former we have the cpu_has_tsc() macro.
Well, CONFIG_X86_TSC simply asserts we have TSCs present and in sync and
cpu_has_tsc is a run-time check for the same. The X86_FEATURE_TSC bit
shouldn't be set (and e.g. "notsc" takes care of this) unless TSCs work
correctly as it's both used internally and exported to the userland. For
low-level fiddling with TSCs one can use cpuid either directly or with the
cpuid driver.
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +
Hi!
> Hey Marcelo,
> I know its probably poor form to send this out so close to -rc, but I
> figured I might as well give it a shot. I'll happily resubmit this for
> the .20pre series later if you'd prefer.
Why so gly #ifdefs? We can disable it runtime already, you should need
just *one* ifdef.
Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa
On Fri, Jun 14, 2002 at 11:35:26AM -0700, john stultz wrote:
> This results in sequential calls to gettimeofday to return
> non-sequential time values. By disabling the TSCs on these boxes, it
> forces gettimeofday to use the PIC clock instead, fixing the problem.
This seems to be yet another reason for supporting per-CPU TSC
calibration, as that would fix machines with different speed cpus, too.
-ben
--
"You will be reincarnated as a toad; and you will be much happier."
On Fri, Jun 14, 2002 at 11:35:26AM -0700, john stultz wrote:
> This patch disables the TSCs when compiled for Multiquad NUMA hardware.
> Due to the slower interconnect, the TSCs aren't being synced properly at
> boot time. Even if they were synced, since the different nodes are
> driven by different crystals, the TSCs still drift.
-#ifndef CONFIG_X86_TSC
+#if !defined(CONFIG_X86_TSC)||defined(CONFIG_TSC_DISABLE)
+#ifdef CONFIG_TSC_DISABLE
+static int tsc_disable __initdata = 1;
+#else /*CONFIG_TSC_DISABLE*/
static int tsc_disable __initdata = 0;
+#endif /*CONFIG_TSC_DISABLE*/
This looks *really horrible*
Why not just unset CONFIG_X86_TSC for those machines ?
Dave
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
On Fri, 2002-06-14 at 11:57, Dave Jones wrote:
> -#ifndef CONFIG_X86_TSC
> +#if !defined(CONFIG_X86_TSC)||defined(CONFIG_TSC_DISABLE)
> +#ifdef CONFIG_TSC_DISABLE
> +static int tsc_disable __initdata = 1;
> +#else /*CONFIG_TSC_DISABLE*/
> static int tsc_disable __initdata = 0;
> +#endif /*CONFIG_TSC_DISABLE*/
>
> This looks *really horrible*
True, I agree here.
> Why not just unset CONFIG_X86_TSC for those machines ?
I was actually hoping for a suggestion like this when I posted this
patch earlier. Can one really just unset CONFIG_ options that have
previously been set? I'd actually prefer this, but doing so generated a
.config that looked like:
CONFIG_X86_TSC=y
...
# CONFIG_X86_TSC is not set
So I assumed CONFIG_X86_TSC would still hold. Am I wrong, or is there
another way to do this?
Thanks
-john
On Fri, Jun 14, 2002 at 12:04:18PM -0700, john stultz wrote:
> .config that looked like:
>
> CONFIG_X86_TSC=y
> ...
> # CONFIG_X86_TSC is not set
> So I assumed CONFIG_X86_TSC would still hold. Am I wrong, or is there
> another way to do this?
Ugh, I hadn't realised the .config generation was so primitive.
That's quite unfortunate. That needs fixing at some point.
Dave
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
On Fri, 14 Jun 2002 21:56:54 +0200, Dave Jones wrote:
>On Fri, Jun 14, 2002 at 12:04:18PM -0700, john stultz wrote:
>
> > .config that looked like:
> >
> > CONFIG_X86_TSC=y
> > ...
> > # CONFIG_X86_TSC is not set
> > So I assumed CONFIG_X86_TSC would still hold. Am I wrong, or is there
> > another way to do this?
>
>Ugh, I hadn't realised the .config generation was so primitive.
>That's quite unfortunate. That needs fixing at some point.
Unless my memory is failing me, I believe the simplest approach
is to (1) don't set CONFIG_X86_TSC, and (2) pass "notsc" as a
kernel boot parameter.
CONFIG_X86_TSC means "the machine has working TSC, period".
That's an intensional optimisation.
Without CONFIG_X86_TSC, Linux manages without TSC, but will
detect and use it if it's there.
Finally, the "notsc" kernel parameter is for obscure cases
where the TSC is present, but should not be used for whatever
reason. I guess the present issue qualifies...
/Mikael
On Fri, 2002-06-14 at 14:53, Mikael Pettersson wrote:
> Unless my memory is failing me, I believe the simplest approach
> is to (1) don't set CONFIG_X86_TSC, and (2) pass "notsc" as a
> kernel boot parameter.
Correct, and this patch basically does both of the above.
The problem is that CONFIG_X86_TSC is enabled on PPro and above cpus.
The machines which are having this problem are multi-node P3 or P4
systems. Each cpu has a working TSC, its just that because they are not
synced they should not be used.
So the patch adds a CONFIG_DISABLE_TSC which is then checked where
earlier just CONFIG_X86_TSC was used. Additionally, if
CONFIG_DISABLE_TSC is set, the flag set by "notsc" is also set.
The usage of CONFIG_X86_TSC took me a bit to get my head around
initially, so your clarification is helpful.
Thanks
-john
On Fri, 14 Jun 2002, Dave Jones wrote:
> On Fri, Jun 14, 2002 at 12:04:18PM -0700, john stultz wrote:
>
> > .config that looked like:
> >
> > CONFIG_X86_TSC=y
> > ...
> > # CONFIG_X86_TSC is not set
> > So I assumed CONFIG_X86_TSC would still hold. Am I wrong, or is there
> > another way to do this?
>
> Ugh, I hadn't realised the .config generation was so primitive.
> That's quite unfortunate. That needs fixing at some point.
I suppose you could it rewrite like
...
CONFIG_X86_WANT_TSC=y (or whatever)
...
if [ some_condition ]; then
define_bool CONFIG_X86_TSC n
else
define_bool CONFIG_X86_TSC $CONFIG_X86_WANT_TSC
fi
Not exactly elegant, but it should work ;)
--Kai
On Fri, 2002-06-14 at 16:29, Kai Germaschewski wrote:
> I suppose you could it rewrite like
>
> ...
> CONFIG_X86_WANT_TSC=y (or whatever)
> ...
>
> if [ some_condition ]; then
> define_bool CONFIG_X86_TSC n
> else
> define_bool CONFIG_X86_TSC $CONFIG_X86_WANT_TSC
> fi
>
> Not exactly elegant, but it should work ;)
Yep, my first release was done in a similar fashion, but Alan suggested
the patch take on its current form. There may be cases where we want to
know if we have a TSC even if we don't want to use them.
Thread link:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0205.3/1188.html
Thanks for the suggestion,
-john
On 14 Jun 2002 16:44:30 -0700, john stultz wrote:
>On Fri, 2002-06-14 at 16:29, Kai Germaschewski wrote:
>> I suppose you could it rewrite like
>>
>> ...
>> CONFIG_X86_WANT_TSC=y (or whatever)
>> ...
>>
>> if [ some_condition ]; then
>> define_bool CONFIG_X86_TSC n
>> else
>> define_bool CONFIG_X86_TSC $CONFIG_X86_WANT_TSC
>> fi
>>
>> Not exactly elegant, but it should work ;)
>
>Yep, my first release was done in a similar fashion, but Alan suggested
>the patch take on its current form. There may be cases where we want to
>know if we have a TSC even if we don't want to use them.
>
>Thread link:
>http://www.uwsg.iu.edu/hypermail/linux/kernel/0205.3/1188.html
I disagree with Alan's recommendation.
The real problem is that the kernel confuses a CPU-level property
(do the CPUs have TSCs?) with a system-level property (are the
TSCs present and in sync?). CONFIG_X86_TSC really describes the
latter property, for the former we have the cpu_has_tsc() macro.
IMO, Kai is right and a nicer fix is to change arch/i386/config.in to:
- s/CONFIG_X86_TSC=y/CONFIG_X86_CPU_HAS_TSC=y/
(this one can also be used as an optimisation to avoid runtime
cpu_has_tsc() checks)
- append a rule which derives CONFIG_X86_TSC from CONFIG_X86_CPU_HAS_TSC
and !multiquad
The other patch which adds an anti-CONFIG_X86_TSC to cancel the
first CONFIG_X86_TSC is so horribly hacky...
/Mikael