2000-11-07 03:20:02

by Frank Davis

Subject: Pentium 4 and 2.4/2.5

Hello,
I noticed that Pentium 4 isn't a config option in 2.4.0-test10. Is
someone working on a patch for the kernel (if needed) to support the
Pentium 4 after 2.4.0 is released?

Regards,
Frank


2000-11-07 03:41:46

by Andre Hedrick

Subject: Re: Pentium 4 and 2.4/2.5


Not to worry, some of us are working with the 'I' guys to do proper P4
detection.

Cheers,

On Sat, 4 Nov 2000, Frank Davis wrote:

> Hello,
> I noticed that Pentium 4 isn't a config option in 2.4.0-test10. Is
> someone working on a patch for the kernel (if needed) to support the
> Pentium 4 after 2.4.0 is released?
>
> Regards,
> Frank
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> Please read the FAQ at http://www.tux.org/lkml/
>

Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development

2000-11-07 04:02:11

by Robert Love

Subject: Re: Pentium 4 and 2.4/2.5

On Sat, 4 Nov 2000, Frank Davis hissed:
> I noticed that Pentium 4 isn't a config option in 2.4.0-test10. Is
> someone working on a patch for the kernel (if needed) to support the
> Pentium 4 after 2.4.0 is released?

from what i have read of the Pentium IV, the linux kernel will not need
any patches to run successfully.

that being said, a lot of opportunity exists for optimization, i bet. some
686-core optimizations may need to be rethought, but there are at least
some things we can do better to take advantage of the P4. if nothing
else, the new pipeline size and cache-

--
Robert M. Love
[email protected]
[email protected]

2000-11-07 12:04:16

by Alan Cox

Subject: Re: Pentium 4 and 2.4/2.5

> I noticed that Pentium 4 isn't a config option in 2.4.0-test10. Is
> someone working on a patch for the kernel (if needed) to support the
> Pentium 4 after 2.4.0 is released?

And also for 2.2. 2.2.18pre18/19 should ident the CPU fine. A contributed patch
should also report the caches correctly in 2.2.18pre20 once I release it.

The big 2.4 issue is that 2.4 won't work with a CPU running at 2GHz or higher
(2.2.18 will be the first 2.2 kernel handling this). The changes have yet to be
pushed into 2.4. Thus judging by Intel's noises so far it will only be early
PIV processors that work ;)

Alan

2000-11-07 12:13:36

by Alan Cox

Subject: Re: Pentium 4 and 2.4/2.5

> Not to worry, some of us are working with the 'I' guys to do proper P4
> detection.

Be careful with the intel patches. The ones I've seen so far tried to call the
cpu 'if86' breaking several tools that do cpu model checking off uname. They
didn't fix the 2GHz CPU limit, they use 'rep nop' in the locks which is
explicitly 'undefined behaviour' for non intel processors and they use the
TSC without checking it had one.

Hopefully they have improved since

Alan

2000-11-07 21:07:18

by Lyle Coder

Subject: Re: Pentium 4 and 2.4/2.5

Alan,
are you saying that rep;nop is not needed in the spinlocks? (because they
are for P4)

Thanks
Lyle
----- Original Message -----
From: "Alan Cox" <[email protected]>
To: "Andre Hedrick" <[email protected]>
Cc: "Frank Davis" <[email protected]>; <[email protected]>
Sent: Tuesday, November 07, 2000 4:13 AM
Subject: Re: Pentium 4 and 2.4/2.5


> > Not to worry, some of us are working with the 'I' guys to do proper P4
> > detection.
>
> Be careful with the intel patches. The ones I've seen so far tried to call the
> cpu 'if86' breaking several tools that do cpu model checking off uname. They
> didn't fix the 2GHz CPU limit, they use 'rep nop' in the locks which is
> explicitly 'undefined behaviour' for non intel processors and they use the
> TSC without checking it had one.
>
> Hopefully they have improved since
>
> Alan

2000-11-07 21:48:34

by Alan Cox

Subject: Re: Pentium 4 and 2.4/2.5

> are you saying that rep;nop is not needed in the spinlocks? (because they
> are for P4)

rep;nop is a magic instruction on the PIV and possibly some PIII series CPUs
[not sure]. As far as I can make out it naps momentarily or until bus
activity thus saving power on spinlocks.

The problem is 'rep nop' is not defined on other cpus so we can only really use
it on the PIII/PIV kernel builds

2000-11-07 23:04:17

by Frank Davis

Subject: Re: Pentium 4 and 2.4/2.5

Alan,
As for 'rep nop', couldn't we add in the code, as an example:
#ifdef Pentium_4
rep nop
#endif

As for the 2.2.18 patch for correctly determining 2GHz and above, can
it be easily merged into the 2.4.x kernel, and if so, what's the maximum
clock speed that can be detected?

Regards,
-Frank

On Tue, 7 Nov 2000 21:48:40 +0000 (GMT) Alan Cox
<[email protected]> writes:
> > are you saying that rep;nop is not needed in the spinlocks? (because they
> > are for P4)
>
> rep;nop is a magic instruction on the PIV and possibly some PIII series CPUs
> [not sure]. As far as I can make out it naps momentarily or until bus
> activity thus saving power on spinlocks.
>
> The problem is 'rep nop' is not defined on other cpus so we can only really use
> it on the PIII/PIV kernel builds
>

2000-11-08 00:43:04

by Alan Cox

Subject: Re: Pentium 4 and 2.4/2.5

> As for the 2.2.18 patch for correctly determining 2GHz and above, can
> it be easily merged into the 2.4.x kernel, and if so, what's the maximum
> clock speed that can be detected?

It should be easy, yes. It's good to 100GHz or so now ;)

2000-11-08 17:27:08

by Linus Torvalds

Subject: Re: Pentium 4 and 2.4/2.5

In article <[email protected]>,
Alan Cox <[email protected]> wrote:
>
>Be careful with the intel patches. The ones I've seen so far tried to call the
>cpu 'if86' breaking several tools that do cpu model checking off uname. They
>didn't fix the 2GHz CPU limit, they use 'rep nop' in the locks which is
>explicitly 'undefined behaviour' for non intel processors and they use the
>TSC without checking it had one.

"rep nop" is definitely not undefined behaviour except in some older
Intel manuals.

Do you actually know of a CPU where it doesn't work? Every single
intel-compatible CPU I know of has the rep prefixes as no-ops if they
aren't used (lock -> ILL being a later, documented, addition), and the
way the prefixes work it almost has to be that way.

As prefixes they can't be part of the instruction, because you can
legally have other prefixes in between the rep and the real instruction,
which means that any sane implementation will just set a flag when it
sees the prefix, and an instruction that doesn't care will just ignore
the flag. So you'd almost have to do _extra_ work to make "rep nop"
fail, even if it used to be specified as "undefined".
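
To make the "set a flag and ignore it" argument concrete, here is a toy
sketch in C. It is not how any real CPU decodes instructions; the names
toy_decode, toy_is_nop and the PFX_* flags are invented for illustration.
Only the byte values (0xF3 rep, 0xF2 repne, 0xF0 lock, 0x66 operand-size,
0x90 nop) are the real x86 encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flags a toy decoder latches when it sees a prefix byte (hypothetical names). */
enum {
    PFX_REP    = 1 << 0,  /* 0xF3 */
    PFX_REPNE  = 1 << 1,  /* 0xF2 */
    PFX_LOCK   = 1 << 2,  /* 0xF0 */
    PFX_OPSIZE = 1 << 3   /* 0x66 */
};

struct decoded {
    unsigned prefixes;  /* OR of PFX_* flags seen before the opcode */
    uint8_t  opcode;    /* first non-prefix byte */
    size_t   length;    /* bytes consumed, 0 if no opcode was found */
};

/* Consume prefix bytes in any order, then take the next byte as the opcode. */
static struct decoded toy_decode(const uint8_t *code, size_t n)
{
    struct decoded d = { 0, 0, 0 };
    size_t i;

    assert(code != NULL);
    for (i = 0; i < n; i++) {
        unsigned flag;

        switch (code[i]) {
        case 0xF3: flag = PFX_REP;    break;
        case 0xF2: flag = PFX_REPNE;  break;
        case 0xF0: flag = PFX_LOCK;   break;
        case 0x66: flag = PFX_OPSIZE; break;
        default:   flag = 0;          break;
        }
        if (flag == 0) {          /* not a prefix: this is the opcode */
            d.opcode = code[i];
            d.length = i + 1;
            break;
        }
        d.prefixes |= flag;       /* just remember the prefix and move on */
    }
    return d;
}

/* The NOP check never looks at d->prefixes, so a latched REP is ignored. */
static int toy_is_nop(const struct decoded *d)
{
    return d->length != 0 && d->opcode == 0x90;
}
```

Feeding it the two bytes F3 90 yields a nop with PFX_REP set, and F3 66 90
works the same way, which is the "other prefixes in between" case. A CPU
that wants "rep nop" to mean something else has to add a check for that
combination explicitly.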

Standard 2.4.x will definitely be using "rep nop" unless somebody can
show me a CPU where it doesn't work (and even then I probably won't care
unless that CPU is also SMP-capable). It's documented by intel these
days, and it works on all CPU's I've ever heard of, and it even makes
sense to me (*).

(*) Well.. More sense than _some_ instruction set extensions I've seen.
After all, "repeat no-op" for a longer delay sounds almost logical.
Certainly better than that IV == 15 thing, ugh ;)

Also, at least part of the reason Intel removed the TSC check was that
Linux actually seems to get the extended CPU capability flags wrong,
overwriting the _real_ capability flags which in turn caused the TSC
check on Linux to simply not work. Peter Anvin is working on fixing
this. I suspect that Linux-2.2 has the same problem.

There's a few other minor details that need to be fixed for Pentium 4
features (aka "not very well documented errata"), and I think I have
them all except for waiting for Peter to get the capabilities flag
handling right.

So I suspect that we'll have good support for Pentium IV soon enough..

Linus

2000-11-08 17:32:19

by Linus Torvalds

Subject: Re: Pentium 4 and 2.4/2.5

In article <[email protected]>,
Alan Cox <[email protected]> wrote:
>
>rep;nop is a magic instruction on the PIV and possibly some PIII series CPUs
>[not sure]. As far as I can make out it naps momentarily or until bus
>activity thus saving power on spinlocks.

From what I've heard, the reason Intel _really_ wants "rep nop" is that
without it the CPU will heat up quite efficiently (that's what you do
when you want to run at an eventual 2GHz with all cylinders firing all
the time), causing thermal meltdown on non-thermally protected CPU's and
CPU speed throttling on the ones that _are_ thermally protected (which
will obviously have to be all the shipping ones).

And the thermal throttling will severely cripple performance.

>The problem is 'rep nop' is not defined on other cpus so we can only really use
>it on the PIII/PIV kernel builds

Intel retroactively defined it for all their CPU's. And I very strongly
suspect that every single other x86 CPU vendor does the same. Why not?
They get a new instruction for free, by just documenting it. Maybe they
can sell the same old chip with a new name ("The Xxxxx Wonderchip. Now
with documented 'rep nop' support! Get one today!").

Linus

2000-11-08 17:50:20

by Alan Cox

Subject: Re: Pentium 4 and 2.4/2.5

> unless that CPU is also SMP-capable). It's documented by intel these
> days, and it works on all CPU's I've ever heard of, and it even makes
> sense to me (*).

Do the Intel docs guarantee it works on i486 and higher? If so, SMP Athlon
will be the only check needed for the SMP users. You work for an x86 chip
cloning company so if you say it works I trust you 8)

> Also, at least part of the reason Intel removed the TSC check was that
> Linux actually seems to get the extended CPU capability flags wrong,
> overwriting the _real_ capability flags which in turn caused the TSC
> check on Linux to simply not work. Peter Anvin is working on fixing
> this. I suspect that Linux-2.2 has the same problem.

I've not seen incorrect TSC detection in 2.2, do you know the precise
circumstances this occurs and I'll check over them. I've also got no
bug reports of this failing.

check_config would also panic with the 'Kernel compiled for ..' message
if it occurred.

> There's a few other minor details that need to be fixed for Pentium 4
> features (aka "not very well documented errata"), and I think I have
> them all except for waiting for Peter to get the capabilities flag
> handling right.
>
> So I suspect that we'll have good support for Pentium IV soon enough..

Excellent

2000-11-08 18:11:32

by Linus Torvalds

Subject: Re: Pentium 4 and 2.4/2.5



On Wed, 8 Nov 2000, Alan Cox wrote:
> > unless that CPU is also SMP-capable). It's documented by intel these
> > days, and it works on all CPU's I've ever heard of, and it even makes
> > sense to me (*).
>
> Do the Intel docs guarantee it works on i486 and higher? If so, SMP Athlon
> will be the only check needed for the SMP users. You work for an x86 chip
> cloning company so if you say it works I trust you 8)

Well, we don't make low-power SMP laptops, so as such Transmeta doesn't
much care. It will work, though. And yes, as far as I know Intel made it
an "architecture feature", meaning that they claim it works on all their
ia32 chips.

Now, I could imagine that Intel would select an instruction that didn't
work on Athlon on purpose, but I really don't think they did. I don't
have an athlon to test.

It's easy enough to generate a test-program. If the following works,
you're pretty much guaranteed that it's ok

	#include <stdio.h>

	int main(void)
	{
		printf("Testing 'rep nop' ... ");
		fflush(stdout);	/* show the message even if the next line faults */
		asm volatile("rep ; nop");
		printf("okey-dokey\n");
		return 0;
	}

(there's not much a "rep nop" _can_ do, after all - the most likely CPU
extension would be to raise an "Illegal Opcode" fault).

> > Also, at least part of the reason Intel removed the TSC check was that
> > Linux actually seems to get the extended CPU capability flags wrong,
> > overwriting the _real_ capability flags which in turn caused the TSC
> > check on Linux to simply not work. Peter Anvin is working on fixing
> > this. I suspect that Linux-2.2 has the same problem.
>
> I've not seen incorrect TSC detection in 2.2, do you know the precise
> circumstances this occurs and I'll check over them. I've also got no
> bug reports of this failing.

It won't fail on other CPU's. The bug is, as far as I can tell, in
get_model_name(),

cpuid(0x80000001, &dummy, &dummy, &dummy, &(c->x86_capability));

Notice how we overwrite the x86_capability state with whatever we read
from the extended register 0x80000001. So we overwrite the _real_
capabilities that we got the right way in head.S.

This is wrong. It just happens to work on other, non-Pentium IV,
processors. The extended capabilities are an _extension_, not a replacement,
for the regular capabilities.

> check_config would also panic with the 'Kernel compiled for ..' message
> if it occurred.

Which is what it apparently does, if you compile for TSC. Even though very
obviously a Pentium IV _does_ have a TSC.

NOTE! I don't actually have access to a Pentium IV myself yet, although
I'm promised one soon enough. So I've only got second-hand reports on the
cpuid thing so far.

Linus

2000-11-08 18:17:02

by Alan Cox

Subject: Re: Pentium 4 and 2.4/2.5

> It won't fail on other CPU's. The bug is, as far as I can tell, in
> get_model_name(),
>
> cpuid(0x80000001, &dummy, &dummy, &dummy, &(c->x86_capability));

Dave Jones fixed this one - for intel we don't use get_model_name() blindly
now. I can see how some earlier 2.2.18pre's would have blown up, but 2.2.17
would (fortunately) be ok.

Thanks

> Notice how we overwrite the x86_capability state with whatever we read
> from the extended register 0x80000001. So we overwrite the _real_
> capabilities that we got the right way in head.S.

Yep

Alan

2000-11-08 18:18:02

by Brian Pomerantz

Subject: Re: Pentium 4 and 2.4/2.5

On Wed, Nov 08, 2000 at 10:10:45AM -0800, Linus Torvalds wrote:
>
> Now, I could imagine that Intel would select an instruction that didn't
> work on Athlon on purpose, but I really don't think they did. I don't
> have an athlon to test.
>
> It's easy enough to generate a test-program. If the following works,
> you're pretty much guaranteed that it's ok
>
> int main()
> {
> printf("Testing 'rep nop' ... ");
> asm volatile("rep ; nop");
> printf("okey-dokey\n");
> return 0;
> }
>
> (there's not much a "rep nop" _can_ do, after all - the most likely CPU
> extension would be to raise an "Illegal Opcode" fault).
>

Just for the curious, this works on Athlons. :)


BAPper

2000-11-08 18:22:22

by Alan Cox

Subject: Re: Pentium 4 and 2.4/2.5

> > asm volatile("rep ; nop");
> >
> > (there's not much a "rep nop" _can_ do, after all - the most likely CPU
> > extension would be to raise an "Illegal Opcode" fault).
>
> Just for the curious, this works on Athlons. :)

What state does it leave the condition codes ? That matters.

Take for example

	if (!oldval)
		asm volatile(
			"2:"
			"cmpl $-1, %0;"
			"rep; nop;"
			"je 2b;"
				: :"m" (current->need_resched));
}

When running SMP with poll_idle enabled. I can't see it changing condition
codes on an athlon but..

2000-11-08 18:29:33

by Benjamin LaHaise

[permalink] [raw]
Subject: Re: Pentium 4 and 2.4/2.5

On Wed, 8 Nov 2000, Alan Cox wrote:

> What state does it leave the condition codes ? That matters.

Alan, rep ; nop is one of the suggested 2 byte fillers in the Athlon
optimization guide; it's handled during instruction decode and is
completely free. It also has no effect on K6s.

-ben

2000-11-08 18:34:24

by Brian Pomerantz

Subject: Re: Pentium 4 and 2.4/2.5

On Wed, Nov 08, 2000 at 06:21:54PM +0000, Alan Cox wrote:
> > > asm volatile("rep ; nop");
> > >
> > > (there's not much a "rep nop" _can_ do, after all - the most likely CPU
> > > extension would be to raise an "Illegal Opcode" fault).
> >
> > Just for the curious, this works on Athlons. :)
>
> What state does it leave the condition codes ? That matters.
>
> Take for example
>
> if (!oldval)
> asm volatile(
> "2:"
> "cmpl $-1, %0;"
> "rep; nop;"
> "je 2b;"
> : :"m" (current->need_resched));
> }
>
> When running SMP with poll_idle enabled. I can't see it changing condition
> codes on an athlon but..

Yup, that works as well. This example:

int foo = -1;
asm volatile(
"2:"
"cmpl $-1, %0;"
"rep; nop;"
"je 2b;"
: :"m" (foo));

loops forever. If you set 'foo = 0' it drops out.


BAPper

2000-11-08 18:47:58

by Alan Cox

Subject: Re: Pentium 4 and 2.4/2.5

> On Wed, 8 Nov 2000, Alan Cox wrote:
>
> > What state does it leave the condition codes ? That matters.
>
> Alan, rep ; nop is one of the suggested 2 byte fillers in the Athlon
> optimization guide; it's handled during instruction decode and is
> completely free. It also has no effect on K6s.

Ok. Issue settled. So 'rep nop' is safe. Ok that can get into the spinlocks
for 2.2.18
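
For readers who want to see what "rep nop in the spinlocks" looks like, here
is a rough user-space sketch, not the kernel's spinlock code. It assumes
GCC or Clang inline asm and C11 atomics; spin_hint, toy_spinlock and the
toy_spin_* functions are names invented here. "rep; nop" assembles to the
bytes F3 90, which Pentium 4 and later document as a spin-wait hint and
which older CPUs, per the discussion above, run as a plain nop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: tell the CPU we are busy-waiting. */
static inline void spin_hint(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("rep; nop" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");  /* compiler barrier only */
#endif
}

/* Toy test-and-set lock, for illustration only. */
typedef struct {
    atomic_flag locked;
} toy_spinlock;

static inline void toy_spin_lock(toy_spinlock *l)
{
    assert(l != NULL);
    /* Spin until the flag was previously clear; hint the CPU on each miss. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        spin_hint();
}

static inline void toy_spin_unlock(toy_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A lock is declared as `toy_spinlock l = { ATOMIC_FLAG_INIT };` and then
bracketed with toy_spin_lock(&l) / toy_spin_unlock(&l). The hint sits only
in the wait path, so an uncontended lock never executes it.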


2000-11-09 20:43:14

by Simon Kirby

Subject: Re: Pentium 4 and 2.4/2.5

On Wed, Nov 08, 2000 at 06:47:40PM +0000, Alan Cox wrote:

> Ok. Issue settled. So 'rep nop' is safe. Ok that can get into the spinlocks
> for 2.2.18

Just curious... What does "rep nop" actually accomplish, anyway?

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [email protected] ][ [email protected] ]
[ Opinions expressed are not necessarily those of my employers. ]