Subject: [PATCH 2/5] x86: PAT: fix ambiguous paranoia check in pat_init()

Starting with commit 8d4a4300854f3971502e81dacd930704cb88f606 (x86:
cleanup PAT cpu validation) the PAT CPU feature flag is not cleared
anymore. Now the error message

"PAT enabled, but CPU feature cleared"

in pat_init() is misleading.

Furthermore the current code does not check for existence of the PAT
CPU feature flag if a CPU is whitelisted in validate_pat_support.

This patch clears pat_wc_enabled if boot CPU has no PAT feature flag
and adapts the paranoia check.

Signed-off-by: Andreas Herrmann <[email protected]>
---
arch/x86/kernel/cpu/addon_cpuid_features.c | 7 ++++---
arch/x86/mm/pat.c | 11 ++++++-----
2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/addon_cpuid_features.c b/arch/x86/kernel/cpu/addon_cpuid_features.c
index 0fbd062..2df461f 100644
--- a/arch/x86/kernel/cpu/addon_cpuid_features.c
+++ b/arch/x86/kernel/cpu/addon_cpuid_features.c
@@ -53,6 +53,9 @@ void __cpuinit init_scattered_cpuid_features(struct cpuinfo_x86 *c)
#ifdef CONFIG_X86_PAT
void __cpuinit validate_pat_support(struct cpuinfo_x86 *c)
{
+ if (!cpu_has_pat)
+ pat_disable("PAT not supported by CPU.");
+
switch (c->x86_vendor) {
case X86_VENDOR_INTEL:
if (c->x86 == 0xF || (c->x86 == 6 && c->x86_model >= 15))
@@ -64,8 +67,6 @@ void __cpuinit validate_pat_support(struct cpuinfo_x86 *c)
return;
}

- pat_disable(cpu_has_pat ?
- "PAT disabled. Not yet verified on this CPU type." :
- "PAT not supported by CPU.");
+ pat_disable("PAT disabled. Not yet verified on this CPU type.");
}
#endif
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index e83b770..8f288f8 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -76,14 +76,15 @@ void pat_init(void)
return;

/* Paranoia check. */
- if (!cpu_has_pat) {
- printk(KERN_ERR "PAT enabled, but CPU feature cleared\n");
+ if (!cpu_has_pat && boot_pat_state) {
/*
- * Panic if this happens on the secondary CPU, and we
+ * If this happens we are on a secondary CPU, but
* switched to PAT on the boot CPU. We have no way to
* undo PAT.
- */
- BUG_ON(boot_pat_state);
+ */
+ printk(KERN_ERR "PAT enabled, "
+ "but not supported by secondary CPU\n");
+ BUG();
}

/* Set PWT to Write-Combining. All other bits stay the same */
--
1.5.5.3



2008-06-10 22:55:47

by Rene Herman

[permalink] [raw]
Subject: Re: [PATCH 2/5] x86: PAT: fix ambiguous paranoia check in pat_init()

On 10-06-08 16:05, Andreas Herrmann wrote:

> Starting with commit 8d4a4300854f3971502e81dacd930704cb88f606 (x86:
> cleanup PAT cpu validation) the PAT CPU feature flag is not cleared
> anymore. Now the error message
>
> "PAT enabled, but CPU feature cleared"
>
> in pat_init() is misleading.

No, it's correct. This is to be read as "The PAT feature is enabled in
the kernel but the CPU claims to not have PAT". The message is not a
leftover from the old flag-clearing setup or anything; it was introduced
in the patch that disabled that.

> Furthermore the current code does not check for existence of the PAT
> CPU feature flag if a CPU is whitelisted in validate_pat_support.

Which is okay-ish, or at least by design it seems. The paranoia check in
pat.c will catch this case.

The current setup is okay really. The only thing it still wants is a
commandline whitelist switch but that needs a somewhat different setup
as validate_pat_support() is too early for __setup() or early_param().

Rene.

2008-06-10 23:40:39

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 2/5] x86: PAT: fix ambiguous paranoia check in pat_init()

Rene Herman wrote:
>
> The current setup is okay really. The only thing it still wants is a
> commandline whitelist switch but that needs a somewhat different setup
> as validate_pat_support() is too early for __setup() or early_param().
>

Actually, I think it's time to nuke the whitelist. I don't think it has
any CPUs on it that have the PAT flag enabled, still.

-hpa

Subject: Re: [PATCH 2/5] x86: PAT: fix ambiguous paranoia check in pat_init()

On Tue, Jun 10, 2008 at 04:33:49PM -0700, H. Peter Anvin wrote:
> Rene Herman wrote:
>> The current setup is okay really. The only thing it still wants is a
>> commandline whitelist switch but that needs a somewhat different setup as
>> validate_pat_support() is too early for __setup() or early_param().
>
> Actually, I think it's time to nuke the whitelist. I don't think it has
> any CPUs on it that have the PAT flag enabled, still.

For Intel all family 0xf CPUs, and family 6 CPUs starting with model
15 are whitelisted.

There seem to be other Intel CPUs that advertise PAT support.
See cpuinfo output at http://gentoo-wiki.com/Safe_Cflags
E.g. Pentium M (model 13), Celeron (model 6), Pentium III (model 8).
(Not sure how correct this information is, though)

Turn the white- into a blacklist for those remaining CPUs until they
are verified?


Andreas

Subject: Re: [PATCH 2/5] x86: PAT: fix ambiguous paranoia check in pat_init()

On Wed, Jun 11, 2008 at 12:55:23AM +0200, Rene Herman wrote:
> On 10-06-08 16:05, Andreas Herrmann wrote:
>
>> Starting with commit 8d4a4300854f3971502e81dacd930704cb88f606 (x86:
>> cleanup PAT cpu validation) the PAT CPU feature flag is not cleared
>> anymore. Now the error message
>> "PAT enabled, but CPU feature cleared"
>> in pat_init() is misleading.
>
> No, it's correct. This is to be read as "The PAT feature is enabled in the
> kernel but the CPU claims to not have PAT". The message is not a leftover
> from the old flag-clearing setup or anything; it was introduced in the
> patch that disabled that.

Ok, right you are. It was introduced with the commit that removed the clearing
of the feature flag.

>> Furthermore the current code does not check for existence of the PAT
>> CPU feature flag if a CPU is whitelisted in validate_pat_support.
>
> Which is okay-ish, or at least by design it seems. The paranoia check in
> pat.c will catch this case.

IMHO this case won't be caught in x86/pat branch.
I guess you haven't checked this feature branch?

To verify the code that I've found there I did the following:

As I had no Transmeta or Centaur CPU at hand I just cleared the PAT
flag in the CPU identification code to simulate the case that all CPUs
of a Vendor are whitelisted (even those w/o PAT support). The first
time pat_init() is entered you get

PAT enabled, but CPU feature cleared
(=> which is wrong as no flag was cleared)
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
(=> which is wrong as PAT shouldn't be enabled on such CPUs)

For the second CPU BUG_ON will come into play as boot_pat_state is set.
But I guess, that will never be the case on such rather old CPUs.

Furthermore when PAT is really not supported you might get an GP when
writing to the PAT MSR.

The older code was just right to exit pat_init() if the boot cpu had
no PAT feature flag. See pat_known_cpu(void) in the old code
(git show 8d4a4300).

> The current setup is okay really.

No, not in x86/pat branch.
(Or have I missed something, need more coffee or what ... ;-)

IMHO the current state is:

We can whitelist all Transmeta, Centaur and AMD CPUs (no errata wrt
PAT are known). So the assumption is that for those CPUs PAT works if
advertised.

Then we have Intel for which all family 0xf CPUs, and family 6 CPUs
starting with model 15 are whitelisted. AFAIK there are other Intel
CPUs that advertise PAT support.
See for instance cpuinfo output at http://gentoo-wiki.com/Safe_Cflags
E.g. Pentium M (model 13), Celeron (model 6), Pentium III (model 8).

Has someone checked for errata regarding PAT for those CPUs? If not,
the whitelist should be kept or at least turned into a blacklist for
the remaining Intel CPUs until this is verified.

> The only thing it still wants is a commandline whitelist switch but
> that needs a somewhat different setup as validate_pat_support() is
> too early for __setup() or early_param().

That should be easy to do. At the latest pat_init() should call
validate_pat_...() (if and only if boot_pat_state==0). Then
introducing some pat=force parameter and the validate function should
ignore the whitelist when this parameter was given and just rely on
the CPU feature flag to activate PAT.


Regards,

Andreas

2008-06-11 12:59:30

by Rene Herman

[permalink] [raw]
Subject: Re: [PATCH 2/5] x86: PAT: fix ambiguous paranoia check in pat_init()

On 11-06-08 11:41, Andreas Herrmann wrote:

> As I had no Transmeta or Centaur CPU at hand I just cleared the PAT
> flag in the CPU identification code to simulate the case that all CPUs
> of a Vendor are whitelisted (even those w/o PAT support). The first
> time pat_init() is entered you get
>
> PAT enabled, but CPU feature cleared
> (=> which is wrong as no flag was cleared)

Again, you are misreading this. Please just replace the message mentally
by "PAT enabled, but CPU claims to not support PAT". "cleared" here does
not signify that we ourselves cleared anything, just that flag IS clear
(unset). Yes, maybe the wording could be better but it's not wrong.

> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> (=> which is wrong as PAT shouldn't be enabled on such CPUs)

Again not wrong, or at least by design. Thomas Gleixner did it this way
and with that "paranoia check" explicitly bombing out only for SMP this
wouldn't have been by accident. He no doubt knows why he did so (and
he's in CC so if we're real lucky we might also now...)

> IMHO the current state is:
>
> We can whitelist all Transmeta, Centaur and AMD CPUs (no errata wrt
> PAT are known). So the assumption is that for those CPUs PAT works if
> advertised.
>
> Then we have Intel for which all family 0xf CPUs, and family 6 CPUs
> starting with model 15 are whitelisted. AFAIK there are other Intel
> CPUs that advertise PAT support.

Definitely. Even my P2 (family 6, model 5) does. The whitelist was
discussed to be a temporary measure in an attempt to for now not muddy
testing of the feature with reports from CPUs in which the feature
designers weren't particularly interested. It should be turned into a
blacklist at a relatively nearby point after which also no whitelisting
command line switches are needed anymore.

I have myself been applying a patch to whitelist all of AMD family 6 but
with things just temporary this is no big deal...

Rene.

Subject: [PATCH] x86: enable PAT on (almost) all CPUs that advertise it

On Wed, Jun 11, 2008 at 11:47:30AM +0200, Andreas Herrmann wrote:
> There seem to be other Intel CPUs that advertise PAT support.
> See cpuinfo output at http://gentoo-wiki.com/Safe_Cflags
> E.g. Pentium M (model 13), Celeron (model 6), Pentium III (model 8).
> (Not sure how correct this information is, though)
>
> Turn the white- into a blacklist for those remaining CPUs until they
> are verified?

How about the attached patch.
This one should replace patches 1 and 2 that I've sent before.


Regards,

Andreas
--

x86: PAT: enable PAT on (almost) all CPUs that advertise it

Dave Jones recently added Centaur and Transmeta CPUs to the
PAT whitelist. See this thread
http://marc.info/?l=linux-kernel&m=121125666327279

I've checked that for AMD CPUs there are no known errata regarding PAT
-- if the CPU advertises it it should work.

I am not aware of other CPUs which claim to support PAT and are not
verified (besides some older Intel CPU models).

So it's time to enable PAT on CPUs that claim to support it.

I just kept the "whitelist" for Intel CPUs. It is similar to what was
submitted by Venkatesh in the first place.
See pat_known_cpu() in commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
(x86: PAT infrastructure patch).

CC: Dave Jones <[email protected]>
CC: Venkatesh Pallipadi <[email protected]>
Signed-off-by: Andreas Herrmann <[email protected]>
---
arch/x86/kernel/cpu/addon_cpuid_features.c | 19 +++++--------------
1 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/addon_cpuid_features.c b/arch/x86/kernel/cpu/addon_cpuid_features.c
index d8b3e4a..3a66ef4 100644
--- a/arch/x86/kernel/cpu/addon_cpuid_features.c
+++ b/arch/x86/kernel/cpu/addon_cpuid_features.c
@@ -53,22 +53,13 @@ void __cpuinit init_scattered_cpuid_features(struct cpuinfo_x86 *c)
#ifdef CONFIG_X86_PAT
void __cpuinit validate_pat_support(struct cpuinfo_x86 *c)
{
- switch (c->x86_vendor) {
- case X86_VENDOR_AMD:
- if (c->x86 >= 0xf && c->x86 <= 0x11)
- return;
- break;
- case X86_VENDOR_INTEL:
- if (c->x86 == 0xF || (c->x86 == 6 && c->x86_model >= 15))
- return;
- break;
- case X86_VENDOR_CENTAUR:
- case X86_VENDOR_TRANSMETA:
+ if (!cpu_has_pat) {
+ pat_disable("PAT not supported by CPU.");
return;
}

- pat_disable(cpu_has_pat ?
- "PAT disabled. Not yet verified on this CPU type." :
- "PAT not supported by CPU.");
+ if ((c->x86_vendor == X86_VENDOR_INTEL) &&
+ !(c->x86 == 0xF || (c->x86 == 6 && c->x86_model >= 15)))
+ pat_disable("PAT disabled. Not yet verified on this CPU type.");
}
#endif
--
1.5.5.3


Subject: Re: [PATCH 2/5] x86: PAT: fix ambiguous paranoia check in pat_init()

On Wed, Jun 11, 2008 at 02:58:49PM +0200, Rene Herman wrote:
> On 11-06-08 11:41, Andreas Herrmann wrote:
>
>> As I had no Transmeta or Centaur CPU at hand I just cleared the PAT
>> flag in the CPU identification code to simulate the case that all CPUs
>> of a Vendor are whitelisted (even those w/o PAT support). The first
>> time pat_init() is entered you get
>> PAT enabled, but CPU feature cleared (=> which is wrong as no flag
>> was cleared)
>
> Again, you are misreading this. Please just replace the message mentally by
> "PAT enabled, but CPU claims to not support PAT". "cleared" here does not
> signify that we ourselves cleared anything, just that flag IS clear
> (unset). Yes, maybe the wording could be better but it's not wrong.

Well, wording might not be best. But I don't care anymore.
(Just wondering which CPUs are out there that support PAT but don't
advertise it with any feature flag.)

>> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
>> (=> which is wrong as PAT shouldn't be enabled on such CPUs)
>
> Again not wrong, or at least by design. Thomas Gleixner did it this way and
> with that "paranoia check" explicitly bombing out only for SMP this
> wouldn't have been by accident. He no doubt knows why he did so (and he's
> in CC so if we're real lucky we might also now...)

I guess at the time Thomas' patch was commited this was just fine.

But with the recent Transmeta/Centaur patch, validate_pat_support()
returns w/o disabling PAT even for such vendor's CPUs that don't
support PAT,

To prevent this, validate_pat_support() should check for cpu_has_pat
in addition to any other white-black-or-whatsoever-listing.


Regards,

Andreas

2008-06-11 16:46:32

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] x86: enable PAT on (almost) all CPUs that advertise it

Andreas Herrmann wrote:
>
> I just kept the "whitelist" for Intel CPUs. It is similar to what was
> submitted by Venkatesh in the first place.
> See pat_known_cpu() in commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
> (x86: PAT infrastructure patch).
>

Yes, but the "whitelist" for Intel CPUs cover all CPUs which ever
flagged the PAT bit, so it should be nuked, too.

-hpa

2008-06-11 17:36:29

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 2/5] x86: PAT: fix ambiguous paranoia check in pat_init()

Andreas Herrmann wrote:
>
> For Intel all family 0xf CPUs, and family 6 CPUs starting with model
> 15 are whitelisted.
>

Arg, sorry, I read the code brokenly. You're right, of course.

> There seem to be other Intel CPUs that advertise PAT support.
> See cpuinfo output at http://gentoo-wiki.com/Safe_Cflags
> E.g. Pentium M (model 13), Celeron (model 6), Pentium III (model 8).
> (Not sure how correct this information is, though)

Yes, those all have PAT enabled.

> Turn the white- into a blacklist for those remaining CPUs until they
> are verified?

Yes, that would be a suitable course of action; moving it to
Intel-specific code. Then the Intel people can worry about lifting the
remaining blacklisted CPUs.

-hpa

2008-06-12 04:32:43

by Rene Herman

[permalink] [raw]
Subject: Re: [PATCH 2/5] x86: PAT: fix ambiguous paranoia check in pat_init()

On 11-06-08 18:12, Andreas Herrmann wrote:

>> Again not wrong, or at least by design. Thomas Gleixner did it this way and
>> with that "paranoia check" explicitly bombing out only for SMP this
>> wouldn't have been by accident. He no doubt knows why he did so (and he's
>> in CC so if we're real lucky we might also now...)
>
> I guess at the time Thomas' patch was commited this was just fine.
>
> But with the recent Transmeta/Centaur patch, validate_pat_support()
> returns w/o disabling PAT even for such vendor's CPUs that don't
> support PAT,

In a sense that recent patch in the x86 tree could be consired the buggy
one as it fails to explicitly whitelist those models with functional PAT
while THAT was the setup of things here -- but yes, don't get me wrong,
I also think that setup wasn't particularly great.

Your followup patch turns the whitelist into a blacklist, blacklisting
those Intel models which weren't specifically whitelisted before, which
is a saner approach, so <shrug>. If things are ready for that, all the
better.

Rene.