2018-04-26 10:49:24

by David Wang

[permalink] [raw]
Subject: [PATCH v3] report correct CPU/cache topology

Centaur CPUs enumerate the cache topology in the same way as Intel CPUs,
but the function is unused so far.
The Centaur init code also misses to initialize x86_info::max_cores, so
the CPU topology can't be described correctly.

Initialize x86_info::max_cores and invoke init_intel_cachinfo() to make
CPU and cache topology information avaliable and correct

Signed-off-by: David Wang <[email protected]>

Changes from v2 to v3:
*1 define new detect_num_cpu_cores() in common.c to replace the original
intel_num_cpu_cores;
*2 move cpu_detect_cache_sizes inside init_intel_cacheinfo.

Changes from v1 to v2:
*1 replace centaur_num_cpu_cores with early_init_centaur_mc according to
suggestions from tglx;
*2 call cpu_detect_cache_sizes when init_intel_cacheinfo returns 0. For
some very old Centaur CPUs can't support the mechanisms defined in init_
intel_cacheinfo.

---
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/centaur.c | 5 +++++
arch/x86/kernel/cpu/common.c | 14 ++++++++++++++
arch/x86/kernel/cpu/intel.c | 28 ++--------------------------
arch/x86/kernel/cpu/intel_cacheinfo.c | 6 ++++++
5 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 21a1149..96e9070 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -192,6 +192,7 @@ extern u32 get_scattered_cpuid_leaf(unsigned int level,
enum cpuid_regs_idx reg);
extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
extern void init_amd_cacheinfo(struct cpuinfo_x86 *c);
+extern int detect_num_cpu_cores(struct cpuinfo_x86 *c);

extern void detect_extended_topology(struct cpuinfo_x86 *c);
extern void detect_ht(struct cpuinfo_x86 *c);
diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c
index 80d5110..c265494 100644
--- a/arch/x86/kernel/cpu/centaur.c
+++ b/arch/x86/kernel/cpu/centaur.c
@@ -160,6 +160,11 @@ static void init_centaur(struct cpuinfo_x86 *c)
clear_cpu_cap(c, 0*32+31);
#endif
early_init_centaur(c);
+ init_intel_cacheinfo(c);
+ c->x86_max_cores = detect_num_cpu_cores(c);
+#ifdef CONFIG_X86_32
+ detect_ht(c);
+#endif

if (c->cpuid_level > 9) {
unsigned int eax = cpuid_eax(10);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8a5b185..0c80d50 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -577,6 +577,20 @@ static void get_model_name(struct cpuinfo_x86 *c)
*(s + 1) = '\0';
}

+int detect_num_cpu_cores(struct cpuinfo_x86 *c)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ if (!IS_ENABLED(CONFIG_SMP) || c->cpuid_level < 4)
+ return 1;
+
+ cpuid_count(4, 0, &eax, &ebx, &ecx, &edx);
+ if (eax & 0x1f)
+ return (eax >> 26) + 1;
+ else
+ return 1;
+}
+
void cpu_detect_cache_sizes(struct cpuinfo_x86 *c)
{
unsigned int n, dummy, ebx, ecx, edx, l2size;
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index b9693b8..e1138da 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -453,24 +453,6 @@ static void srat_detect_node(struct cpuinfo_x86 *c)
#endif
}

-/*
- * find out the number of processor cores on the die
- */
-static int intel_num_cpu_cores(struct cpuinfo_x86 *c)
-{
- unsigned int eax, ebx, ecx, edx;
-
- if (!IS_ENABLED(CONFIG_SMP) || c->cpuid_level < 4)
- return 1;
-
- /* Intel has a non-standard dependency on %ecx for this CPUID level. */
- cpuid_count(4, 0, &eax, &ebx, &ecx, &edx);
- if (eax & 0x1f)
- return (eax >> 26) + 1;
- else
- return 1;
-}
-
static void detect_vmx_virtcap(struct cpuinfo_x86 *c)
{
/* Intel VMX MSR indicated features */
@@ -671,19 +653,13 @@ static void init_intel(struct cpuinfo_x86 *c)
* let's use the legacy cpuid vector 0x1 and 0x4 for topology
* detection.
*/
- c->x86_max_cores = intel_num_cpu_cores(c);
+ c->x86_max_cores = detect_num_cpu_cores(c);
#ifdef CONFIG_X86_32
detect_ht(c);
#endif
}

- l2 = init_intel_cacheinfo(c);
-
- /* Detect legacy cache sizes if init_intel_cacheinfo did not */
- if (l2 == 0) {
- cpu_detect_cache_sizes(c);
- l2 = c->x86_cache_size;
- }
+ init_intel_cacheinfo(c);

if (c->cpuid_level > 9) {
unsigned eax = cpuid_eax(10);
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 54d04d5..2a0597c 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -20,6 +20,8 @@
#include <asm/amd_nb.h>
#include <asm/smp.h>

+#include "cpu.h"
+
#define LVL_1_INST 1
#define LVL_1_DATA 2
#define LVL_2 3
@@ -802,6 +804,10 @@ unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c)

c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));

+ if (!l2) {
+ cpu_detect_cache_sizes(c);
+ l2 = c->x86_cache_size;
+ }
return l2;
}

--
1.9.1



2018-04-26 11:58:22

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v3] report correct CPU/cache topology



On Thu, 26 Apr 2018, David Wang wrote:

> Centaur CPUs enumerate the cache topology in the same way as Intel CPUs,
> but the function is unused so far.
> The Centaur init code also misses to initialize x86_info::max_cores, so
> the CPU topology can't be described correctly.
>
> Initialize x86_info::max_cores and invoke init_intel_cachinfo() to make
> CPU and cache topology information avaliable and correct

Now that looks pretty good.

> Signed-off-by: David Wang <[email protected]>
>
> Changes from v2 to v3:
> *1 define new detect_num_cpu_cores() in common.c to replace the original
> intel_num_cpu_cores;
> *2 move cpu_detect_cache_sizes inside init_intel_cacheinfo.

But I asked for that being a separate patch with a separate changelog. And
the intel_cache_info() change wants to be in a separate patch as well. Then
the third patch is the one which makes use of these changes for
centaur.

Please read review comments carefully and rather ask when you have doubts
about the meaning.

Thanks,

tglx



2018-05-02 07:15:37

by David Wang

[permalink] [raw]
Subject: Re: [PATCH v3] report correct CPU/cache topology



> -----Original Mail-----
> Sender: Thomas Gleixner [mailto:[email protected]]
> Time: 2018??4??26?? 19:56
> Receiver: David Wang <[email protected]>
> CC: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; brucechang@via-
> alliance.com; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH v3] report correct CPU/cache topology
>
>
>
> On Thu, 26 Apr 2018, David Wang wrote:
>
> > Centaur CPUs enumerate the cache topology in the same way as Intel
> > CPUs, but the function is unused so far.
> > The Centaur init code also misses to initialize x86_info::max_cores,
> > so the CPU topology can't be described correctly.
> >
> > Initialize x86_info::max_cores and invoke init_intel_cachinfo() to
> > make CPU and cache topology information avaliable and correct
>
> Now that looks pretty good.
>
> > Signed-off-by: David Wang <[email protected]>
> >
> > Changes from v2 to v3:
> > *1 define new detect_num_cpu_cores() in common.c to replace the
> > original intel_num_cpu_cores;
> > *2 move cpu_detect_cache_sizes inside init_intel_cacheinfo.
>
> But I asked for that being a separate patch with a separate changelog. And
> the intel_cache_info() change wants to be in a separate patch as well.
Then
> the third patch is the one which makes use of these changes for centaur.
>
> Please read review comments carefully and rather ask when you have
> doubts about the meaning.
>
> Thanks,
>
> tglx
>
Sorry!
I will split the changes to three separate patches.
Thank you.
---
David