2014-10-07 00:19:54

by Bryan O'Donoghue

[permalink] [raw]
Subject: [PATCH v3 0/2] Fix legacy_cache detect on Intel. Add Quark to legacy_cache, document TLB quirk

These two patches cover four things.

First add legacy_cache callback into init_intel() to enable calling of
intel_size_cache() re-enabling detection of PIII Tualatin cache size
and allowing Quark SoC X1000 hooked into this mechanism to similarly
report it's cache size.

Second adding of Quark SoC X1000 to the legacy_cache callback so that
it too will report correct cache size. In this case a 4-way set
associative cache with a 16 byte cache line and 256 lines per tag.

Third add the Quark SoC X1000 string so that /proc/cpuinfo gives the
correct description of the processor in the "model name" field.

Finally documentation of early TLB flushing behaviour on Quark before
cpu_has_pge() has been switched off. For clarity the existing code works
fine on Quark. This change to setup.c documents the minutiae of why.

Bryan O'Donoghue (2):
x86: Quark: Comment setup_arch() to document TLB/PGE behaviour
x86: Add cpu_detect_cache_sizes to init_intel() add Quark to
legacy_cache()

arch/x86/kernel/cpu/intel.c | 17 ++++++++++++++++-
arch/x86/kernel/setup.c | 9 +++++++++
2 files changed, 25 insertions(+), 1 deletion(-)

--
1.9.1


2014-10-07 00:19:55

by Bryan O'Donoghue

[permalink] [raw]
Subject: [PATCH 1/2] x86: Quark: Comment setup_arch() to document TLB/PGE bug

Quark SoC X1000 advertises Page Global Enable for it's
Translation Lookaside Buffer via cpuid. The silicon does not
in fact support PGE and hence will not flush the TLB when CR4.PGE
is rewritten. The Quark documentation makes clear the necessity to
instead rewrite CR3 in order to flush any TLB entries, irrespective
of the state of CR4.PGE or an individual PTE.PGE

See Intel Quark Core DevMan_001.pdf section 6.4.11

In setup.c setup_arch() the code will load_cr3() and then do a
__flush_tlb_all().

On Quark the entire TLB will be flushed at the load_cr3().
The __flush_tlb_all() have no effect and can be safely ignored.

Later on in the boot process we switch off the flag for cpu_has_pge()
which means that subsequent calls to __flush_tlb_all() will
call __flush_tlb() not __flush_tlb_global() flushing the TLB in the
correct way via load_cr3() not CR4.PGE rewrite

This patch documents the behaviour of flushing the TLB for Quark in
setup_arch()

Comment text suggested by Thomas Gleixner

Signed-off-by: Bryan O'Donoghue <[email protected]>
---
arch/x86/kernel/setup.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 41ead8d..235cfd3 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -879,6 +879,15 @@ void __init setup_arch(char **cmdline_p)
KERNEL_PGD_PTRS);

load_cr3(swapper_pg_dir);
+ /*
+ * Note: Quark X1000 CPUs advertise PGE incorrectly and require
+ * a cr3 based tlb flush, so the following __flush_tlb_all()
+ * will not flush anything because the cpu quirk which clears
+ * X86_FEATURE_PGE has not been invoked yet. Though due to the
+ * load_cr3() above the TLB has been flushed already. The
+ * quirk is invoked before subsequent calls to __flush_tlb_all()
+ * so proper operation is guaranteed.
+ */
__flush_tlb_all();
#else
printk(KERN_INFO "Command line: %s\n", boot_command_line);
--
1.9.1

2014-10-07 00:20:43

by Bryan O'Donoghue

[permalink] [raw]
Subject: [PATCH v2 2/2] x86: Add cpu_detect_cache_sizes to init_intel() add Quark legacy_cache()

Intel processors which don't report cache information via cpuid(2)
or cpuid(4) need quirk code in the legacy_cache_size callback to
report this data. For Intel that callback is is intel_size_cache().

This patch enables calling of cpu_detect_cache_sizes() inside of
init_intel() and hence the calling of the legacy_cache callback in
intel_size_cache(). Adding this call will ensure that PIII Tualatin
currently in intel_size_cache() and Quark SoC X1000 being added to
intel_size_cache() in this patch will report their respective cache
sizes.

This model of calling cpu_detect_cache_sizes() is consistent with
AMD/Via/Cirix/Transmeta and Centaur.

Also added is a string to idenitfy the Quark as Quark SoC X1000
giving better and more descriptive output via /proc/cpuinfo

Adding cpu_detect_cache_sizes to init_intel() will enable calling
of intel_size_cache() on Intel processors which currently no code
can reach. Therefore this patch will also re-enable reporting
of PIII Tualatin cache size information as well as add
Quark SoC X1000 support.

Comment text and cache flow logic suggested by Thomas Gleixner

Signed-off-by: Bryan O'Donoghue <[email protected]>
---
arch/x86/kernel/cpu/intel.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 74e804d..6fd5be3 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -382,6 +382,13 @@ static void init_intel(struct cpuinfo_x86 *c)
}

l2 = init_intel_cacheinfo(c);
+
+ /* Detect legacy cache sizes if init_intel_cacheinfo did not */
+ if (l2 == 0) {
+ cpu_detect_cache_sizes(c);
+ l2 = c->x86_cache_size;
+ }
+
if (c->cpuid_level > 9) {
unsigned eax = cpuid_eax(10);
/* Check for version and the number of counters */
@@ -485,6 +492,13 @@ static unsigned int intel_size_cache(struct cpuinfo_x86 *c, unsigned int size)
*/
if ((c->x86 == 6) && (c->x86_model == 11) && (size == 0))
size = 256;
+
+ /*
+ * Intel Quark SoC X1000 contains a 4-way set associative
+ * 16K cache with a 16 byte cache line and 256 lines per tag
+ */
+ if ((c->x86 == 5) && (c->x86_model == 9))
+ size = 16;
return size;
}
#endif
@@ -686,7 +700,8 @@ static const struct cpu_dev intel_cpu_dev = {
[3] = "OverDrive PODP5V83",
[4] = "Pentium MMX",
[7] = "Mobile Pentium 75 - 200",
- [8] = "Mobile Pentium MMX"
+ [8] = "Mobile Pentium MMX",
+ [9] = "Quark SoC X1000",
}
},
{ .family = 6, .model_names =
--
1.9.1

Subject: [tip:x86/platform] x86: Quark: Comment setup_arch() to document TLB/PGE bug

Commit-ID: 2075244f9b871f18a007935c73d2ab49d4fb43e0
Gitweb: http://git.kernel.org/tip/2075244f9b871f18a007935c73d2ab49d4fb43e0
Author: Bryan O'Donoghue <[email protected]>
AuthorDate: Tue, 7 Oct 2014 01:19:48 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Wed, 8 Oct 2014 10:07:46 +0200

x86: Quark: Comment setup_arch() to document TLB/PGE bug

Quark SoC X1000 advertises Page Global Enable for it's
Translation Lookaside Buffer via cpuid. The silicon does not
in fact support PGE and hence will not flush the TLB when CR4.PGE
is rewritten. The Quark documentation makes clear the necessity to
instead rewrite CR3 in order to flush any TLB entries, irrespective
of the state of CR4.PGE or an individual PTE.PGE

See Intel Quark Core DevMan_001.pdf section 6.4.11

In setup.c setup_arch() the code will load_cr3() and then do a
__flush_tlb_all().

On Quark the entire TLB will be flushed at the load_cr3().
The __flush_tlb_all() have no effect and can be safely ignored.

Later on in the boot process we switch off the flag for cpu_has_pge()
which means that subsequent calls to __flush_tlb_all() will
call __flush_tlb() not __flush_tlb_global() flushing the TLB in the
correct way via load_cr3() not CR4.PGE rewrite

This patch documents the behaviour of flushing the TLB for Quark in
setup_arch()

Comment text suggested by Thomas Gleixner

Signed-off-by: Bryan O'Donoghue <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/setup.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 41ead8d..235cfd3 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -879,6 +879,15 @@ void __init setup_arch(char **cmdline_p)
KERNEL_PGD_PTRS);

load_cr3(swapper_pg_dir);
+ /*
+ * Note: Quark X1000 CPUs advertise PGE incorrectly and require
+ * a cr3 based tlb flush, so the following __flush_tlb_all()
+ * will not flush anything because the cpu quirk which clears
+ * X86_FEATURE_PGE has not been invoked yet. Though due to the
+ * load_cr3() above the TLB has been flushed already. The
+ * quirk is invoked before subsequent calls to __flush_tlb_all()
+ * so proper operation is guaranteed.
+ */
__flush_tlb_all();
#else
printk(KERN_INFO "Command line: %s\n", boot_command_line);

Subject: [tip:x86/platform] x86: Add cpu_detect_cache_sizes to init_intel( ) add Quark legacy_cache()

Commit-ID: aece118e487a744eafcdd0c77fe32b55ee2092a1
Gitweb: http://git.kernel.org/tip/aece118e487a744eafcdd0c77fe32b55ee2092a1
Author: Bryan O'Donoghue <[email protected]>
AuthorDate: Tue, 7 Oct 2014 01:19:49 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Wed, 8 Oct 2014 10:07:46 +0200

x86: Add cpu_detect_cache_sizes to init_intel() add Quark legacy_cache()

Intel processors which don't report cache information via cpuid(2)
or cpuid(4) need quirk code in the legacy_cache_size callback to
report this data. For Intel that callback is is intel_size_cache().

This patch enables calling of cpu_detect_cache_sizes() inside of
init_intel() and hence the calling of the legacy_cache callback in
intel_size_cache(). Adding this call will ensure that PIII Tualatin
currently in intel_size_cache() and Quark SoC X1000 being added to
intel_size_cache() in this patch will report their respective cache
sizes.

This model of calling cpu_detect_cache_sizes() is consistent with
AMD/Via/Cirix/Transmeta and Centaur.

Also added is a string to idenitfy the Quark as Quark SoC X1000
giving better and more descriptive output via /proc/cpuinfo

Adding cpu_detect_cache_sizes to init_intel() will enable calling
of intel_size_cache() on Intel processors which currently no code
can reach. Therefore this patch will also re-enable reporting
of PIII Tualatin cache size information as well as add
Quark SoC X1000 support.

Comment text and cache flow logic suggested by Thomas Gleixner

Signed-off-by: Bryan O'Donoghue <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/intel.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 50ce751..1ef4562 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -397,6 +397,13 @@ static void init_intel(struct cpuinfo_x86 *c)
}

l2 = init_intel_cacheinfo(c);
+
+ /* Detect legacy cache sizes if init_intel_cacheinfo did not */
+ if (l2 == 0) {
+ cpu_detect_cache_sizes(c);
+ l2 = c->x86_cache_size;
+ }
+
if (c->cpuid_level > 9) {
unsigned eax = cpuid_eax(10);
/* Check for version and the number of counters */
@@ -500,6 +507,13 @@ static unsigned int intel_size_cache(struct cpuinfo_x86 *c, unsigned int size)
*/
if ((c->x86 == 6) && (c->x86_model == 11) && (size == 0))
size = 256;
+
+ /*
+ * Intel Quark SoC X1000 contains a 4-way set associative
+ * 16K cache with a 16 byte cache line and 256 lines per tag
+ */
+ if ((c->x86 == 5) && (c->x86_model == 9))
+ size = 16;
return size;
}
#endif
@@ -701,7 +715,8 @@ static const struct cpu_dev intel_cpu_dev = {
[3] = "OverDrive PODP5V83",
[4] = "Pentium MMX",
[7] = "Mobile Pentium 75 - 200",
- [8] = "Mobile Pentium MMX"
+ [8] = "Mobile Pentium MMX",
+ [9] = "Quark SoC X1000",
}
},
{ .family = 6, .model_names =