Subject: [PATCH 0/5 v3] x86: adapt CPU topology detection for AMD Magny-Cours

Changes to previous patch set:
- remove MULTI_NODE_CPU config option
- provide defaults for cpu_node topology information
- add patch to fix AMD mcheck code

Current patch set contains 5 patches:
- patch 1 adapts common code to show cpu_node_id,
cpu_node_siblings and cpu_node_siblings_list in
/sys/devices/system/cpu/cpu*/topology
- patch 2 prepares arch/x86 to provide cpu_node information
- patch 3 sets up cpu_node information for AMD Magny-Cours CPU
- patch 4 fixes L3 cache information for Magny-Cours
- patch 5 fixes mcheck code for Magny-Cours

Patches are against tip/master.
Please apply.


Thanks,

Andreas

PS: See http://lkml.org/lkml/2009/5/29/263 for previous patch
submission.


Subject: [PATCH 1/5] topology: introduce cpu_node information for multi-node processors


New topology attributes are
- cpu_node_id (id of the internal node)
- cpu_node_siblings and cpu_node_siblings_list
(siblings on the same internal node)

Signed-off-by: Andreas Herrmann <[email protected]>
---
drivers/base/topology.c | 10 ++++++++++
include/linux/topology.h | 9 +++++++++
2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index bf6b132..1e35a43 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -103,6 +103,9 @@ static ssize_t show_##name##_list(struct sys_device *dev, \
define_id_show_func(physical_package_id);
define_one_ro(physical_package_id);

+define_id_show_func(cpu_node_id);
+define_one_ro(cpu_node_id);
+
define_id_show_func(core_id);
define_one_ro(core_id);

@@ -110,6 +113,10 @@ define_siblings_show_func(thread_cpumask);
define_one_ro_named(thread_siblings, show_thread_cpumask);
define_one_ro_named(thread_siblings_list, show_thread_cpumask_list);

+define_siblings_show_func(cpu_node_cpumask);
+define_one_ro_named(cpu_node_siblings, show_cpu_node_cpumask);
+define_one_ro_named(cpu_node_siblings_list, show_cpu_node_cpumask_list);
+
define_siblings_show_func(core_cpumask);
define_one_ro_named(core_siblings, show_core_cpumask);
define_one_ro_named(core_siblings_list, show_core_cpumask_list);
@@ -119,6 +126,9 @@ static struct attribute *default_attrs[] = {
&attr_core_id.attr,
&attr_thread_siblings.attr,
&attr_thread_siblings_list.attr,
+ &attr_cpu_node_id.attr,
+ &attr_cpu_node_siblings.attr,
+ &attr_cpu_node_siblings_list.attr,
&attr_core_siblings.attr,
&attr_core_siblings_list.attr,
NULL
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 7402c1a..976a130 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -180,6 +180,9 @@ int arch_update_cpu_topology(void);
#ifndef topology_physical_package_id
#define topology_physical_package_id(cpu) ((void)(cpu), -1)
#endif
+#ifndef topology_cpu_node_id
+#define topology_cpu_node_id(cpu) ((void)(cpu), 0)
+#endif
#ifndef topology_core_id
#define topology_core_id(cpu) ((void)(cpu), 0)
#endif
@@ -189,12 +192,18 @@ int arch_update_cpu_topology(void);
#ifndef topology_core_siblings
#define topology_core_siblings(cpu) cpumask_of_cpu(cpu)
#endif
+#ifndef topology_cpu_node_siblings
+#define topology_cpu_node_siblings(cpu) topology_core_siblings(cpu)
+#endif
#ifndef topology_thread_cpumask
#define topology_thread_cpumask(cpu) cpumask_of(cpu)
#endif
#ifndef topology_core_cpumask
#define topology_core_cpumask(cpu) cpumask_of(cpu)
#endif
+#ifndef topology_cpu_node_cpumask
+#define topology_cpu_node_cpumask(cpu) topology_core_cpumask(cpu)
+#endif

/* Returns the number of the current Node. */
#ifndef numa_node_id
--
1.6.3.1
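
The generic fallbacks added to include/linux/topology.h use the comma-operator idiom: the argument is evaluated and discarded, and a constant is returned, so architectures without multi-node processors report node id 0 for every CPU. A minimal user-space sketch of that pattern follows; the helper count_cpus_on_node() is hypothetical and exists only to exercise the macro.

```c
#include <assert.h>

/*
 * Generic fallback in the same shape as the patch: an architecture
 * header may define topology_cpu_node_id() first; otherwise every CPU
 * reports internal node 0.
 */
#ifndef topology_cpu_node_id
#define topology_cpu_node_id(cpu)	((void)(cpu), 0)
#endif

/* Hypothetical helper: count CPUs in [0, ncpus) that report node_id. */
static int count_cpus_on_node(int ncpus, int node_id)
{
	int cpu, count = 0;

	assert(ncpus >= 0);
	for (cpu = 0; cpu < ncpus; cpu++)
		if (topology_cpu_node_id(cpu) == node_id)
			count++;
	return count;
}
```

With only the fallback in place, all CPUs land on node 0, which is the behaviour single-node systems should see.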


Subject: [PATCH 2/5] x86: provide CPU topology information for multi-node processors

Provide topology_cpu_node_id, topology_cpu_node_cpumask, cpu_node_mask and cpu_node_map.
CPUs with matching phys_proc_id and cpu_node_id belong to the same
cpu_node.
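
The matching rule above can be sketched in user space roughly as follows. This is an illustration only, not the kernel code: struct cpu_ids, build_node_maps() and the 32-CPU limit are assumptions made so that one mask fits in a uint32_t.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 32	/* illustrative limit: one mask per uint32_t */

/* Hypothetical stand-in for the two relevant fields of struct cpuinfo_x86. */
struct cpu_ids {
	uint16_t phys_proc_id;
	uint16_t cpu_node_id;
};

/*
 * Fill node_map[i] with a bitmask of all CPUs that share both the
 * physical package and the internal node with CPU i.
 */
static void build_node_maps(const struct cpu_ids *ids, int ncpus,
			    uint32_t *node_map)
{
	int i, j;

	assert(ncpus >= 0 && ncpus <= SKETCH_NR_CPUS);
	for (i = 0; i < ncpus; i++) {
		node_map[i] = 0;
		for (j = 0; j < ncpus; j++)
			if (ids[i].phys_proc_id == ids[j].phys_proc_id &&
			    ids[i].cpu_node_id == ids[j].cpu_node_id)
				node_map[i] |= UINT32_C(1) << j;
	}
}
```

For a single 12-core package split into two internal nodes of six cores each, this yields the masks 0x3f and 0xfc0.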

Signed-off-by: Andreas Herrmann <[email protected]>
---
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 6 ++++++
arch/x86/include/asm/topology.h | 2 ++
arch/x86/kernel/cpu/common.c | 2 ++
arch/x86/kernel/cpu/proc.c | 1 +
arch/x86/kernel/smpboot.c | 10 ++++++++++
6 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c776826..74d9258 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -106,6 +106,8 @@ struct cpuinfo_x86 {
u16 booted_cores;
/* Physical processor id: */
u16 phys_proc_id;
+ /* Node id in case of multi-node processor: */
+ u16 cpu_node_id;
/* Core id: */
u16 cpu_core_id;
/* Index into per_cpu list: */
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 6a84ed1..aad37c6 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -22,6 +22,7 @@ extern int smp_num_siblings;
extern unsigned int num_processors;

DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
+DECLARE_PER_CPU(cpumask_var_t, cpu_node_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);
@@ -31,6 +32,11 @@ static inline struct cpumask *cpu_sibling_mask(int cpu)
return per_cpu(cpu_sibling_map, cpu);
}

+static inline struct cpumask *cpu_node_mask(int cpu)
+{
+ return per_cpu(cpu_node_map, cpu);
+}
+
static inline struct cpumask *cpu_core_mask(int cpu)
{
return per_cpu(cpu_core_map, cpu);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 066ef59..9eddb69 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -190,6 +190,8 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_thread_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
+#define topology_cpu_node_id(cpu) (cpu_data(cpu).cpu_node_id)
+#define topology_cpu_node_cpumask(cpu) (per_cpu(cpu_node_map, cpu))

/* indicates that pointers to the topology cpumask_t maps are valid */
#define arch_provides_topology_pointers yes
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3ffdcfa..d14d074 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -478,6 +478,8 @@ out:
if ((c->x86_max_cores * smp_num_siblings) > 1) {
printk(KERN_INFO "CPU: Physical Processor ID: %d\n",
c->phys_proc_id);
+ printk(KERN_INFO "CPU: Processor Node ID: %d\n",
+ c->cpu_node_id);
printk(KERN_INFO "CPU: Processor Core ID: %d\n",
c->cpu_core_id);
}
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index d5e3039..ff539c1 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -15,6 +15,7 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
seq_printf(m, "siblings\t: %d\n",
cpumask_weight(cpu_core_mask(cpu)));
+ seq_printf(m, "node id\t\t: %d\n", c->cpu_node_id);
seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
seq_printf(m, "apicid\t\t: %d\n", c->apicid);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index d2e8de9..d9750ad 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -108,6 +108,10 @@ EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);

+/* representing node siblings on multi-node CPU */
+DEFINE_PER_CPU(cpumask_var_t, cpu_node_map);
+EXPORT_PER_CPU_SYMBOL(cpu_node_map);
+
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -401,6 +405,11 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(i, c->llc_shared_map);
cpumask_set_cpu(cpu, cpu_data(i).llc_shared_map);
}
+ if ((c->phys_proc_id == cpu_data(i).phys_proc_id) &&
+ (c->cpu_node_id == cpu_data(i).cpu_node_id)) {
+ cpumask_set_cpu(i, cpu_node_mask(cpu));
+ cpumask_set_cpu(cpu, cpu_node_mask(i));
+ }
if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
cpumask_set_cpu(cpu, cpu_core_mask(i));
@@ -1218,6 +1227,7 @@ static void remove_siblinginfo(int cpu)
cpumask_clear(cpu_sibling_mask(cpu));
cpumask_clear(cpu_core_mask(cpu));
c->phys_proc_id = 0;
+ c->cpu_node_id = 0;
c->cpu_core_id = 0;
cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
}
--
1.6.3.1


Subject: [PATCH 3/5] x86: add cpu_node topology detection for AMD Magny-Cours

This adapts CPU topology detection for AMD Magny-Cours.

Here is example output from two cores on same package but different
internal cpu_nodes:

/sys/devices/system/cpu/cpu5:
 physical_package_id        : 0
 core_id                    : 5
 thread_siblings            : 00000020
 thread_siblings_list       : 5
 cpu_node_id                : 0
 cpu_node_siblings          : 0000003f
 cpu_node_siblings_list     : 0-5
 core_siblings              : 00000fff
 core_siblings_list         : 0-11
/sys/devices/system/cpu/cpu6:
 physical_package_id        : 0
 core_id                    : 0
 thread_siblings            : 00000040
 thread_siblings_list       : 6
 cpu_node_id                : 1
 cpu_node_siblings          : 00000fc0
 cpu_node_siblings_list     : 6-11
 core_siblings              : 00000fff
 core_siblings_list         : 0-11
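
Each *_siblings / *_siblings_list pair in the output above shows the same set of CPUs twice, once as a hex mask and once as a range list. The sketch below shows that correspondence for masks of up to 32 CPUs; mask_to_list() is a hypothetical user-space helper, not the kernel's own formatting routine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Render a CPU bitmask as a comma-separated list of ranges, for
 * example 0x0000003f as "0-5". Returns the number of characters
 * written, excluding the terminating NUL. Output that would not fit
 * is cut at the last complete entry.
 */
static int mask_to_list(uint32_t mask, char *buf, size_t len)
{
	int bit = 0;
	size_t n = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	while (bit < 32) {
		int start, end, w;

		if (!(mask & (UINT32_C(1) << bit))) {
			bit++;
			continue;
		}
		/* found the start of a run of set bits; find its end */
		start = bit;
		while (bit < 32 && (mask & (UINT32_C(1) << bit)))
			bit++;
		end = bit - 1;

		if (start == end)
			w = snprintf(buf + n, len - n, "%s%d",
				     n ? "," : "", start);
		else
			w = snprintf(buf + n, len - n, "%s%d-%d",
				     n ? "," : "", start, end);
		if (w < 0 || (size_t)w >= len - n) {
			buf[n] = '\0';	/* drop the partial entry */
			break;
		}
		n += (size_t)w;
	}
	return (int)n;
}
```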

Signed-off-by: Andreas Herrmann <[email protected]>
---
arch/x86/include/asm/cpufeature.h | 1 +
arch/x86/kernel/cpu/amd.c | 61 +++++++++++++++++++++++++++++++++++++
2 files changed, 62 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 13cc6a5..7abe596 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -94,6 +94,7 @@
#define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */
#define X86_FEATURE_NONSTOP_TSC (3*32+24) /* TSC does not stop in C states */
#define X86_FEATURE_CLFLUSH_MONITOR (3*32+25) /* "" clflush reqd with monitor */
+#define X86_FEATURE_AMD_DCM (3*32+26) /* multi-node processor */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 (4*32+ 0) /* "pni" SSE-3 */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 728b375..feed057 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -6,6 +6,7 @@
#include <asm/processor.h>
#include <asm/apic.h>
#include <asm/cpu.h>
+#include <asm/pci-direct.h>

#ifdef CONFIG_X86_64
# include <asm/numa_64.h>
@@ -250,6 +251,56 @@ static int __cpuinit nearby_node(int apicid)
#endif

/*
+ * Fixup core topology information for AMD multi-node processors.
+ * Assumption 1: Number of cores in each internal node is the same.
+ * Assumption 2: Mixed systems with both single-node and dual-node
+ * processors are not supported.
+ */
+static void __cpuinit amd_fixup_dcm(struct cpuinfo_x86 *c)
+{
+ u32 t, cpn;
+ u8 n;
+
+ /* fixup topology information only once for a core */
+ if (cpu_has(c, X86_FEATURE_AMD_DCM))
+ return;
+
+ /* check for multi-node processor on boot cpu */
+ t = read_pci_config(0, 24, 3, 0xe8);
+ if (!(t & (1 << 29)))
+ return;
+
+ set_cpu_cap(c, X86_FEATURE_AMD_DCM);
+
+ /* cores per node: each internal node has half the number of cores */
+ cpn = c->x86_max_cores >> 1;
+
+ /* even-numbered NB_id of this dual-node processor */
+ n = c->phys_proc_id << 1;
+
+ /*
+ * determine internal node id and assign cores fifty-fifty to
+ * each node of the dual-node processor
+ */
+ t = read_pci_config(0, 24 + n, 3, 0xe8);
+ n = (t>>30) & 0x3;
+ if (n == 0) {
+ if (c->cpu_core_id < cpn)
+ c->cpu_node_id = 0;
+ else
+ c->cpu_node_id = 1;
+ } else {
+ if (c->cpu_core_id < cpn)
+ c->cpu_node_id = 1;
+ else
+ c->cpu_node_id = 0;
+ }
+
+ /* fixup core id to be in range from 0 to (cpn - 1) */
+ c->cpu_core_id = c->cpu_core_id % cpn;
+}
+
+/*
* On a AMD dual core setup the lower bits of the APIC id distingush the cores.
* Assumes number of cores is a power of two.
*/
@@ -264,6 +315,9 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
c->cpu_core_id = c->initial_apicid & ((1 << bits)-1);
/* Convert the initial APIC ID into the socket ID */
c->phys_proc_id = c->initial_apicid >> bits;
+ /* fixup topology information on multi-node processors */
+ if ((c->x86 == 0x10) && (c->x86_model == 9))
+ amd_fixup_dcm(c);
#endif
}

@@ -274,7 +328,14 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
int node;
unsigned apicid = cpu_has_apic ? hard_smp_processor_id() : c->apicid;

+#ifdef CONFIG_MULTI_NODE_CPU
+ if (cpu_has(c, X86_FEATURE_AMD_DCM))
+ node = (c->phys_proc_id << 1) + c->cpu_node_id;
+ else
+ node = c->phys_proc_id;
+#else
node = c->phys_proc_id;
+#endif
if (apicid_to_node[apicid] != NUMA_NO_NODE)
node = apicid_to_node[apicid];
if (!node_online(node)) {
--
1.6.3.1
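
The arithmetic in amd_fixup_dcm() can be modelled in user space as below. This is a sketch under stated assumptions: struct core_topo and split_cores() are hypothetical names, and first_node_sel stands for the two-bit value the patch extracts with (t >> 30) & 0x3; no PCI config access is modelled.

```c
#include <assert.h>

/* Hypothetical subset of struct cpuinfo_x86. */
struct core_topo {
	unsigned int cpu_core_id;
	unsigned int cpu_node_id;
};

/*
 * max_cores:      cores per package (x86_max_cores in the patch)
 * first_node_sel: value of (t >> 30) & 0x3 as read by the patch
 * core_id:        package-wide core id before the fixup
 */
static struct core_topo split_cores(unsigned int max_cores,
				    unsigned int first_node_sel,
				    unsigned int core_id)
{
	struct core_topo t;
	unsigned int cpn = max_cores >> 1;	/* cores per internal node */
	unsigned int upper_half;

	assert(cpn > 0 && core_id < max_cores);
	upper_half = core_id >= cpn;		/* 0 or 1 */

	/* same fifty-fifty assignment as the if/else ladder in the patch */
	t.cpu_node_id = (first_node_sel == 0) ? upper_half : !upper_half;
	/* core id becomes node-local: 0 .. cpn - 1 */
	t.cpu_core_id = core_id % cpn;
	return t;
}
```

With 12 cores per package and first_node_sel == 0, core 5 stays core 5 on node 0 while core 6 becomes core 0 on node 1, which matches the example output in the patch description.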


Subject: [PATCH 4/5] x86: cacheinfo: fixup L3 cache information for AMD Magny-Cours


L3 cache size, associativity and shared_cpu information need to be
adapted to show information for an internal node instead of the
entire physical package.
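
One consequence of halving both size and associativity is that the number of sets stays the same. The sketch below illustrates this with made-up numbers (12288 KB, 32 ways, 64-byte lines are illustrative values, not read from hardware); struct l3_geom and both functions are hypothetical. Unlike the CPUID-style fields in the patch, the functions return plain counts rather than "value minus one".

```c
#include <assert.h>

/* Hypothetical description of an L3 cache. */
struct l3_geom {
	unsigned int size_in_kb;
	unsigned int ways;
	unsigned int line_size;	/* bytes */
};

/* Per-node view: on a dual-node package each node owns half the L3. */
static struct l3_geom l3_per_node(struct l3_geom pkg, int multi_node)
{
	if (multi_node) {
		pkg.size_in_kb >>= 1;
		pkg.ways >>= 1;
	}
	return pkg;
}

/* sets = size / line size / ways */
static unsigned int l3_number_of_sets(struct l3_geom g)
{
	assert(g.line_size > 0 && g.ways > 0);
	return (g.size_in_kb * 1024u) / g.line_size / g.ways;
}
```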

Signed-off-by: Andreas Herrmann <[email protected]>
---
arch/x86/kernel/cpu/intel_cacheinfo.c | 20 ++++++++++++++------
1 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 789efe2..3b54a1e 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -241,7 +241,7 @@ amd_cpuid4(int leaf, union _cpuid4_leaf_eax *eax,
case 0:
if (!l1->val)
return;
- assoc = l1->assoc;
+ assoc = assocs[l1->assoc];
line_size = l1->line_size;
lines_per_tag = l1->lines_per_tag;
size_in_kb = l1->size_in_kb;
@@ -249,7 +249,7 @@ amd_cpuid4(int leaf, union _cpuid4_leaf_eax *eax,
case 2:
if (!l2.val)
return;
- assoc = l2.assoc;
+ assoc = assocs[l2.assoc];
line_size = l2.line_size;
lines_per_tag = l2.lines_per_tag;
/* cpu_data has errata corrections for K7 applied */
@@ -258,10 +258,14 @@ amd_cpuid4(int leaf, union _cpuid4_leaf_eax *eax,
case 3:
if (!l3.val)
return;
- assoc = l3.assoc;
+ assoc = assocs[l3.assoc];
line_size = l3.line_size;
lines_per_tag = l3.lines_per_tag;
size_in_kb = l3.size_encoded * 512;
+ if (boot_cpu_has(X86_FEATURE_AMD_DCM)) {
+ size_in_kb = size_in_kb >> 1;
+ assoc = assoc >> 1;
+ }
break;
default:
return;
@@ -278,10 +282,10 @@ amd_cpuid4(int leaf, union _cpuid4_leaf_eax *eax,
eax->split.num_cores_on_die = current_cpu_data.x86_max_cores - 1;


- if (assoc == 0xf)
+ if (assoc == 0xffff)
eax->split.is_fully_associative = 1;
ebx->split.coherency_line_size = line_size - 1;
- ebx->split.ways_of_associativity = assocs[assoc] - 1;
+ ebx->split.ways_of_associativity = assoc - 1;
ebx->split.physical_line_partition = lines_per_tag - 1;
ecx->split.number_of_sets = (size_in_kb * 1024) / line_size /
(ebx->split.ways_of_associativity + 1) - 1;
@@ -598,7 +602,11 @@ static void __cpuinit get_cpu_leaves(void *_retval)
cache_remove_shared_cpu_map(cpu, i);
break;
}
- cache_shared_cpu_map_setup(cpu, j);
+ if (boot_cpu_has(X86_FEATURE_AMD_DCM))
+ cpumask_copy(to_cpumask(this_leaf->shared_cpu_map),
+ topology_cpu_node_cpumask(cpu));
+ else
+ cache_shared_cpu_map_setup(cpu, j);
}
}

--
1.6.3.1


Subject: [PATCH 5/5] x86: mcheck: make use of cpu_node_mask instead of cpu_core_mask to support multi-node processors

This fixes threshold_bank4 support.

We need to create the symlinks for sibling shared banks to the first
core of each internal node, not to the first core of a single internal
node for the whole package.
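
The difference can be sketched with plain bitmasks. mask_first() is a simplified user-space stand-in for cpumask_first() (it returns -1 for an empty mask, which is an assumption of this sketch), and bank_owner() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit, or -1 if the mask is empty. */
static int mask_first(uint32_t mask)
{
	int bit;

	for (bit = 0; bit < 32; bit++)
		if (mask & (UINT32_C(1) << bit))
			return bit;
	return -1;
}

/*
 * CPU that owns the shared bank for `cpu`: the first CPU in the given
 * sibling mask. Which mask is passed in decides the symlink target.
 */
static int bank_owner(uint32_t sibling_mask, int cpu)
{
	assert(cpu >= 0 && cpu < 32);
	assert(sibling_mask & (UINT32_C(1) << cpu));
	return mask_first(sibling_mask);
}
```

For CPU 7 on the second internal node of a 12-core package, the core mask 0xfff picks CPU 0 (on the other node), whereas the node mask 0xfc0 picks CPU 6.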

Signed-off-by: Andreas Herrmann <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd_64.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd_64.c b/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
index ddae216..595cbe5 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
@@ -494,7 +494,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)

#ifdef CONFIG_SMP
if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
- i = cpumask_first(cpu_core_mask(cpu));
+ i = cpumask_first(cpu_node_mask(cpu));

/* first core not up yet */
if (cpu_data(i).cpu_core_id)
@@ -514,7 +514,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
if (err)
goto out;

- cpumask_copy(b->cpus, cpu_core_mask(cpu));
+ cpumask_copy(b->cpus, cpu_node_mask(cpu));
per_cpu(threshold_banks, cpu)[bank] = b;

goto out;
@@ -539,7 +539,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
#ifndef CONFIG_SMP
cpumask_setall(b->cpus);
#else
- cpumask_copy(b->cpus, cpu_core_mask(cpu));
+ cpumask_copy(b->cpus, cpu_node_mask(cpu));
#endif

per_cpu(threshold_banks, cpu)[bank] = b;
--
1.6.3.1


2009-06-03 14:48:19

by Bert Wesarg

Subject: Re: [PATCH 3/5] x86: add cpu_node topology detection for AMD Magny-Cours

On Wed, Jun 3, 2009 at 16:35, Andreas Herrmann
<[email protected]> wrote:
> This adapts CPU topology detection for AMD Magny-Cours.
>
> Here is example output from two cores on same package but different
> internal cpu_nodes:
>
> /sys/devices/system/cpu/cpu5:
>  physical_package_id        : 0
>  core_id                    : 5
>  thread_siblings            : 00000020
>  thread_siblings_list       : 5
>  cpu_node_id                : 0
>  cpu_node_siblings          : 0000003f
>  cpu_node_siblings_list     : 0-5
>  core_siblings              : 00000fff
>  core_siblings_list         : 0-11
> /sys/devices/system/cpu/cpu6:
>  physical_package_id        : 0
>  core_id                    : 0
>  thread_siblings            : 00000040
>  thread_siblings_list       : 6
>  cpu_node_id                : 1
>  cpu_node_siblings          : 00000fc0
>  cpu_node_siblings_list     : 6-11
>  core_siblings              : 00000fff
>  core_siblings_list         : 0-11
>
> Signed-off-by: Andreas Herrmann <[email protected]>
> ---
>  arch/x86/include/asm/cpufeature.h |    1 +
>  arch/x86/kernel/cpu/amd.c         |   61 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 62 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 728b375..feed057 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -274,7 +328,14 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
>        int node;
>        unsigned apicid = cpu_has_apic ? hard_smp_processor_id() : c->apicid;
>
> +#ifdef CONFIG_MULTI_NODE_CPU
> +       if (cpu_has(c, X86_FEATURE_AMD_DCM))
> +               node = (c->phys_proc_id << 1) + c->cpu_node_id;
> +       else
> +               node = c->phys_proc_id;
> +#else
>        node = c->phys_proc_id;
> +#endif
Stale CONFIG_MULTI_NODE_CPU?

Regards,
Bert

>        if (apicid_to_node[apicid] != NUMA_NO_NODE)
>                node = apicid_to_node[apicid];
>        if (!node_online(node)) {
> --
> 1.6.3.1
>
>
>
>

2009-06-03 14:56:09

by Bert Wesarg

Subject: Re: [PATCH 1/5] topology: introduce cpu_node information for multi-node processors

On Wed, Jun 3, 2009 at 16:29, Andreas Herrmann
<[email protected]> wrote:
>
> New topology attributes are
> - cpu_node_id (id of the internal node)
> - cpu_node_siblings and cpu_node_siblings_list
>  (siblings on the same internal node)
Looks good. Thank you.

Acked-by: [email protected]

>
> Signed-off-by: Andreas Herrmann <[email protected]>
> ---
>  drivers/base/topology.c  |   10 ++++++++++
>  include/linux/topology.h |    9 +++++++++
>  2 files changed, 19 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/base/topology.c b/drivers/base/topology.c
> index bf6b132..1e35a43 100644
> --- a/drivers/base/topology.c
> +++ b/drivers/base/topology.c
> @@ -103,6 +103,9 @@ static ssize_t show_##name##_list(struct sys_device *dev,           \
>  define_id_show_func(physical_package_id);
>  define_one_ro(physical_package_id);
>
> +define_id_show_func(cpu_node_id);
> +define_one_ro(cpu_node_id);
> +
>  define_id_show_func(core_id);
>  define_one_ro(core_id);
>
> @@ -110,6 +113,10 @@ define_siblings_show_func(thread_cpumask);
>  define_one_ro_named(thread_siblings, show_thread_cpumask);
>  define_one_ro_named(thread_siblings_list, show_thread_cpumask_list);
>
> +define_siblings_show_func(cpu_node_cpumask);
> +define_one_ro_named(cpu_node_siblings, show_cpu_node_cpumask);
> +define_one_ro_named(cpu_node_siblings_list, show_cpu_node_cpumask_list);
> +
>  define_siblings_show_func(core_cpumask);
>  define_one_ro_named(core_siblings, show_core_cpumask);
>  define_one_ro_named(core_siblings_list, show_core_cpumask_list);
> @@ -119,6 +126,9 @@ static struct attribute *default_attrs[] = {
>        &attr_core_id.attr,
>        &attr_thread_siblings.attr,
>        &attr_thread_siblings_list.attr,
> +       &attr_cpu_node_id.attr,
> +       &attr_cpu_node_siblings.attr,
> +       &attr_cpu_node_siblings_list.attr,
>        &attr_core_siblings.attr,
>        &attr_core_siblings_list.attr,
>        NULL
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 7402c1a..976a130 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -180,6 +180,9 @@ int arch_update_cpu_topology(void);
>  #ifndef topology_physical_package_id
>  #define topology_physical_package_id(cpu)      ((void)(cpu), -1)
>  #endif
> +#ifndef topology_cpu_node_id
> +#define topology_cpu_node_id(cpu)              ((void)(cpu), 0)
> +#endif
>  #ifndef topology_core_id
>  #define topology_core_id(cpu)                  ((void)(cpu), 0)
>  #endif
> @@ -189,12 +192,18 @@ int arch_update_cpu_topology(void);
>  #ifndef topology_core_siblings
>  #define topology_core_siblings(cpu)            cpumask_of_cpu(cpu)
>  #endif
> +#ifndef topology_cpu_node_siblings
> +#define topology_cpu_node_siblings(cpu)                topology_core_siblings(cpu)
> +#endif
>  #ifndef topology_thread_cpumask
>  #define topology_thread_cpumask(cpu)           cpumask_of(cpu)
>  #endif
>  #ifndef topology_core_cpumask
>  #define topology_core_cpumask(cpu)             cpumask_of(cpu)
>  #endif
> +#ifndef topology_cpu_node_cpumask
> +#define topology_cpu_node_cpumask(cpu)         topology_core_cpumask(cpu)
> +#endif
>
>  /* Returns the number of the current Node. */
>  #ifndef numa_node_id
> --
> 1.6.3.1
>
>
>
>

Subject: [PATCH 3/5 retry] x86: add cpu_node topology detection for AMD Magny-Cours

On Wed, Jun 03, 2009 at 04:48:07PM +0200, Bert Wesarg wrote:
> On Wed, Jun 3, 2009 at 16:35, Andreas Herrmann
> <[email protected]> wrote:

> > diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> > index 728b375..feed057 100644
> > --- a/arch/x86/kernel/cpu/amd.c
> > +++ b/arch/x86/kernel/cpu/amd.c
> > @@ -274,7 +328,14 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
> >        int node;
> >        unsigned apicid = cpu_has_apic ? hard_smp_processor_id() : c->apicid;
> >
> > +#ifdef CONFIG_MULTI_NODE_CPU
> > +       if (cpu_has(c, X86_FEATURE_AMD_DCM))
> > +               node = (c->phys_proc_id << 1) + c->cpu_node_id;
> > +       else
> > +               node = c->phys_proc_id;
> > +#else
> >        node = c->phys_proc_id;
> > +#endif
> Stale CONFIG_MULTI_NODE_CPU?

Argh!
Thanks for catching this.

Fixed patch is attached.


Regards,
Andreas

---

This adapts CPU topology detection for AMD Magny-Cours.

Here is example output from two cores on same package but different
internal cpu_nodes:

/sys/devices/system/cpu/cpu5:
 physical_package_id        : 0
 core_id                    : 5
 thread_siblings            : 00000020
 thread_siblings_list       : 5
 cpu_node_id                : 0
 cpu_node_siblings          : 0000003f
 cpu_node_siblings_list     : 0-5
 core_siblings              : 00000fff
 core_siblings_list         : 0-11
/sys/devices/system/cpu/cpu6:
 physical_package_id        : 0
 core_id                    : 0
 thread_siblings            : 00000040
 thread_siblings_list       : 6
 cpu_node_id                : 1
 cpu_node_siblings          : 00000fc0
 cpu_node_siblings_list     : 6-11
 core_siblings              : 00000fff
 core_siblings_list         : 0-11

Signed-off-by: Andreas Herrmann <[email protected]>
---
arch/x86/include/asm/cpufeature.h | 1 +
arch/x86/kernel/cpu/amd.c | 60 ++++++++++++++++++++++++++++++++++++-
2 files changed, 60 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 13cc6a5..7abe596 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -94,6 +94,7 @@
#define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */
#define X86_FEATURE_NONSTOP_TSC (3*32+24) /* TSC does not stop in C states */
#define X86_FEATURE_CLFLUSH_MONITOR (3*32+25) /* "" clflush reqd with monitor */
+#define X86_FEATURE_AMD_DCM (3*32+26) /* multi-node processor */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 (4*32+ 0) /* "pni" SSE-3 */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 728b375..8c27925 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -6,6 +6,7 @@
#include <asm/processor.h>
#include <asm/apic.h>
#include <asm/cpu.h>
+#include <asm/pci-direct.h>

#ifdef CONFIG_X86_64
# include <asm/numa_64.h>
@@ -250,6 +251,56 @@ static int __cpuinit nearby_node(int apicid)
#endif

/*
+ * Fixup core topology information for AMD multi-node processors.
+ * Assumption 1: Number of cores in each internal node is the same.
+ * Assumption 2: Mixed systems with both single-node and dual-node
+ * processors are not supported.
+ */
+static void __cpuinit amd_fixup_dcm(struct cpuinfo_x86 *c)
+{
+ u32 t, cpn;
+ u8 n;
+
+ /* fixup topology information only once for a core */
+ if (cpu_has(c, X86_FEATURE_AMD_DCM))
+ return;
+
+ /* check for multi-node processor on boot cpu */
+ t = read_pci_config(0, 24, 3, 0xe8);
+ if (!(t & (1 << 29)))
+ return;
+
+ set_cpu_cap(c, X86_FEATURE_AMD_DCM);
+
+ /* cores per node: each internal node has half the number of cores */
+ cpn = c->x86_max_cores >> 1;
+
+ /* even-numbered NB_id of this dual-node processor */
+ n = c->phys_proc_id << 1;
+
+ /*
+ * determine internal node id and assign cores fifty-fifty to
+ * each node of the dual-node processor
+ */
+ t = read_pci_config(0, 24 + n, 3, 0xe8);
+ n = (t>>30) & 0x3;
+ if (n == 0) {
+ if (c->cpu_core_id < cpn)
+ c->cpu_node_id = 0;
+ else
+ c->cpu_node_id = 1;
+ } else {
+ if (c->cpu_core_id < cpn)
+ c->cpu_node_id = 1;
+ else
+ c->cpu_node_id = 0;
+ }
+
+ /* fixup core id to be in range from 0 to (cpn - 1) */
+ c->cpu_core_id = c->cpu_core_id % cpn;
+}
+
+/*
* On a AMD dual core setup the lower bits of the APIC id distingush the cores.
* Assumes number of cores is a power of two.
*/
@@ -264,6 +315,9 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
c->cpu_core_id = c->initial_apicid & ((1 << bits)-1);
/* Convert the initial APIC ID into the socket ID */
c->phys_proc_id = c->initial_apicid >> bits;
+ /* fixup topology information on multi-node processors */
+ if ((c->x86 == 0x10) && (c->x86_model == 9))
+ amd_fixup_dcm(c);
#endif
}

@@ -274,7 +328,11 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
int node;
unsigned apicid = cpu_has_apic ? hard_smp_processor_id() : c->apicid;

- node = c->phys_proc_id;
+ if (cpu_has(c, X86_FEATURE_AMD_DCM))
+ node = (c->phys_proc_id << 1) + c->cpu_node_id;
+ else
+ node = c->phys_proc_id;
+
if (apicid_to_node[apicid] != NUMA_NO_NODE)
node = apicid_to_node[apicid];
if (!node_online(node)) {
--
1.6.3.1
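
The default node numbering used in srat_detect_node() before any apicid_to_node[] override can be written as a small pure function. This is only a sketch: dcm_numa_node() is a hypothetical name and the is_dcm flag stands in for cpu_has(c, X86_FEATURE_AMD_DCM).

```c
#include <assert.h>

/*
 * Default NUMA node for a core: on a dual-node package each physical
 * package contributes two consecutive node numbers; otherwise the node
 * number equals the package id.
 */
static int dcm_numa_node(unsigned int phys_proc_id,
			 unsigned int cpu_node_id, int is_dcm)
{
	assert(cpu_node_id <= 1);	/* dual-node packages only */
	if (is_dcm)
		return (int)((phys_proc_id << 1) + cpu_node_id);
	return (int)phys_proc_id;
}
```

On a two-socket dual-node system this gives nodes 0 and 1 for package 0, and nodes 2 and 3 for package 1.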


2009-06-07 13:40:27

by Ingo Molnar

Subject: Re: [PATCH 0/5 v3] x86: adapt CPU topology detection for AMD Magny-Cours


* Andreas Herrmann <[email protected]> wrote:

> Changes to previous patch set:
> - remove MULTI_NODE_CPU config option
> - provide defaults for cpu_node topology information
> - add patch to fix AMD mcheck code
>
> Current patch set contains 5 patches:
> - patch 1 adapts common code to show cpu_node_id,
> cpu_node_siblings and cpu_node_siblings_list in
> /sys/devices/system/cpu/cpu*/topology
> - patch 2 prepares arch/x86 to provide cpu_node information
> - patch 3 sets up cpu_node information for AMD Magny-Cours CPU
> - patch 4 fixes L3 cache information for Magny-Cours
> - patch 5 fixes mcheck code for Magny-Cours

it would be really nice to propagate this info to where it _really_
matters: the sched-domains topology info - unless i'm missing
something this patch-set does not do that yet, right? That way we'll
get actual feedback if it's broken, and will help people if it works
right. Device allocation matters too, but to a much lesser degree.

Ingo

Subject: Re: [PATCH 0/5 v3] x86: adapt CPU topology detection for AMD Magny-Cours

On Sun, Jun 07, 2009 at 03:40:09PM +0200, Ingo Molnar wrote:
>
> * Andreas Herrmann <[email protected]> wrote:
>
> > Changes to previous patch set:
> > - remove MULTI_NODE_CPU config option
> > - provide defaults for cpu_node topology information
> > - add patch to fix AMD mcheck code
> >
> > Current patch set contains 5 patches:
> > - patch 1 adapts common code to show cpu_node_id,
> > cpu_node_siblings and cpu_node_siblings_list in
> > /sys/devices/system/cpu/cpu*/topology
> > - patch 2 prepares arch/x86 to provide cpu_node information
> > - patch 3 sets up cpu_node information for AMD Magny-Cours CPU
> > - patch 4 fixes L3 cache information for Magny-Cours
> > - patch 5 fixes mcheck code for Magny-Cours
>
> it would be really nice to propagate this info to where it _really_
> matters: the sched-domains topology info - unless i'm missing
> something this patch-set does not do that yet, right?

No scheduler modifications are contained in this patch set.

> That way we'll get actual feedback if it's broken, and will help
> people if it works right. Device allocation matters too, but to a
> much lesser degree.

With and without this patch set, the scheduler is broken for Magny-Cours.

When performing

# echo 2 >> /sys/devices/system/cpu/sched_mc_power_savings

I get (both with and without above patches):

CPU23 attaching sched-domain:
domain 0: span 12-23 level MC
groups: 23 12 13 14 15 16 17 18 19 20 21 22
ERROR: parent span is not a superset of domain->span
domain 1: span 18-23 level CPU
ERROR: domain->groups does not contain CPU23
groups: 12-17 (__cpu_power = 12288)
ERROR: groups don't span domain->span
domain 2: span 0-23 level NODE
groups:
ERROR: domain->cpu_power not set

ERROR: groups don't span domain->span

Output is from dmesg -- copied just the lines from the last CPU.
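
The first error in that output is a plain set-inclusion failure: the MC domain spans CPUs 12-23, but its parent (level CPU) spans only 18-23. A minimal user-space sketch of such a check is shown below; span_mask() and span_is_superset() are hypothetical helpers and not the scheduler's debug code.

```c
#include <assert.h>
#include <stdint.h>

/* Bitmask with bits first..last (inclusive) set. */
static uint32_t span_mask(int first, int last)
{
	uint32_t m = 0;
	int cpu;

	assert(first >= 0 && first <= last && last < 32);
	for (cpu = first; cpu <= last; cpu++)
		m |= UINT32_C(1) << cpu;
	return m;
}

/* A parent span is valid only if it contains every CPU of the child. */
static int span_is_superset(uint32_t parent, uint32_t child)
{
	return (child & ~parent) == 0;
}
```

Here span_is_superset(span_mask(18, 23), span_mask(12, 23)) is false, which corresponds to the "parent span is not a superset of domain->span" message.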

I'd appreciate it if you'd pull this patch set for .31. (Maybe I have
to prepare an updated version to avoid conflicts.)

I am working on the scheduler front, but I don't know when the
first patches will be ready for review.


Regards,
Andreas

--
Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632