Subject: [PATCH] x86: Remove wrong error message in x86_default_fixup_cpu_id


It's only called from amd.c:srat_detect_node(). The introduced
condition for calling the fixup code is true for all AMD multi-node
processors, e.g. Magny-Cours and Interlagos. There we have 2 NUMA
nodes on one socket. And thus there are cores having different
numa-node-id but with equal phys_proc_id. For example on such a system
we now get

[ 0.228109] Booting Node 0, Processors #1
[ 0.232337] smpboot cpu 1: start_ip = 83000
[ 0.252088] #2
[ 0.253746] smpboot cpu 2: start_ip = 83000
[ 0.272086] #3
[ 0.276018] smpboot cpu 3: start_ip = 83000
[ 0.296088] #4
[ 0.297745] smpboot cpu 4: start_ip = 83000
[ 0.316088] #5
[ 0.320021] smpboot cpu 5: start_ip = 83000
[ 0.340113] Ok.
[ 0.342324] Booting Node 1, Processors #6
[ 0.344344] smpboot cpu 6: start_ip = 83000
[ 0.016000] NUMA core number 1 differs from configured core number 0
[ 0.372110] #7
[ 0.373771] smpboot cpu 7: start_ip = 83000
[ 0.016000] NUMA core number 1 differs from configured core number 0
[ 0.396104] #8
[ 0.397764] smpboot cpu 8: start_ip = 83000
[ 0.016000] NUMA core number 1 differs from configured core number 0
[ 0.420109] #9
[ 0.421773] smpboot cpu 9: start_ip = 83000
[ 0.016000] NUMA core number 1 differs from configured core number 0
[ 0.444113] #10
[ 0.445865] smpboot cpu 10: start_ip = 83000
[ 0.016000] NUMA core number 1 differs from configured core number 0
[ 0.468111] #11
[ 0.472030] smpboot cpu 11: start_ip = 83000
[ 0.016000] NUMA core number 1 differs from configured core number 0

These NUMA core numbering error messages are plain wrong.

The confusing/misleading error message was introduced with commit
64be4c1c2428e148de6081af235e2418e6a66dda (x86: Add x86_init platform
override to fix up NUMA core numbering) and should be removed.

Reported-by: Borislav Petkov <[email protected]>
Signed-off-by: Andreas Herrmann <[email protected]>
---
arch/x86/kernel/cpu/common.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

BTW, I wonder why the fixup code isn't called from the Intel path. At
least the mentioned patch suggests that something more generic was
introduced here.

And I am curious why there is this specific condition to decide
whether a call to x86_cpuinit.fixup_cpu_id is required.


Regards,
Andreas


diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index d43cad7..2ef7685 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1163,7 +1163,6 @@ static void dbg_restore_debug_regs(void)
*/
void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node)
{
- pr_err("NUMA core number %d differs from configured core number %d\n", node, c->phys_proc_id);
}

/*
--
1.7.8.4



2012-02-21 10:27:37

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86: Remove wrong error message in x86_default_fixup_cpu_id

On Mon, Feb 20, 2012 at 06:17:05PM +0100, Andreas Herrmann wrote:
>
> It's only called from amd.c:srat_detect_node(). The introduced
> condition for calling the fixup code is true for all AMD multi-node
> processors, e.g. Magny-Cours and Interlagos. There we have 2 NUMA
> nodes on one socket. And thus there are cores having different
> numa-node-id but with equal phys_proc_id. For example on such a system
> we now get
>
> [ 0.228109] Booting Node 0, Processors #1
> [ 0.232337] smpboot cpu 1: start_ip = 83000
> [ 0.252088] #2
> [ 0.253746] smpboot cpu 2: start_ip = 83000
> [ 0.272086] #3
> [ 0.276018] smpboot cpu 3: start_ip = 83000
> [ 0.296088] #4
> [ 0.297745] smpboot cpu 4: start_ip = 83000
> [ 0.316088] #5
> [ 0.320021] smpboot cpu 5: start_ip = 83000
> [ 0.340113] Ok.
> [ 0.342324] Booting Node 1, Processors #6
> [ 0.344344] smpboot cpu 6: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.372110] #7
> [ 0.373771] smpboot cpu 7: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.396104] #8
> [ 0.397764] smpboot cpu 8: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.420109] #9
> [ 0.421773] smpboot cpu 9: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.444113] #10
> [ 0.445865] smpboot cpu 10: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.468111] #11
> [ 0.472030] smpboot cpu 11: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
>
> These NUMA core numbering error messages are plain wrong.
>
> The confusing/misleading error message was introduced with commit
> 64be4c1c2428e148de6081af235e2418e6a66dda (x86: Add x86_init platform
> override to fix up NUMA core numbering) and should be removed.
>
> Reported-by: Borislav Petkov <[email protected]>
> Signed-off-by: Andreas Herrmann <[email protected]>
> ---
> arch/x86/kernel/cpu/common.c | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
> BTW, I wonder why the fixup code isn't called from the Intel path. At
> least the mentioned patch suggests that something more generic was
> introduced here.

Right, and I would remove the check in amd.c:srat_detect_node() instead
of removing the printk statement in the default implementation.

IOW, we need more info on why the check was added only to the AMD path?
Daniel?

Thanks.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

2012-02-21 11:05:27

by Daniel J Blueman

[permalink] [raw]
Subject: Re: [PATCH] x86: Remove wrong error message in x86_default_fixup_cpu_id

On 21/02/2012 10:27, Borislav Petkov wrote:
> On Mon, Feb 20, 2012 at 06:17:05PM +0100, Andreas Herrmann wrote:
>
> It's only called from amd.c:srat_detect_node(). The introduced
> condition for calling the fixup code is true for all AMD multi-node
> processors, e.g. Magny-Cours and Interlagos. There we have 2 NUMA
> nodes on one socket. And thus there are cores having different
> numa-node-id but with equal phys_proc_id. For example on such a system
> we now get
>
> [ 0.228109] Booting Node 0, Processors #1
> [ 0.232337] smpboot cpu 1: start_ip = 83000
> [ 0.252088] #2
> [ 0.253746] smpboot cpu 2: start_ip = 83000
> [ 0.272086] #3
> [ 0.276018] smpboot cpu 3: start_ip = 83000
> [ 0.296088] #4
> [ 0.297745] smpboot cpu 4: start_ip = 83000
> [ 0.316088] #5
> [ 0.320021] smpboot cpu 5: start_ip = 83000
> [ 0.340113] Ok.
> [ 0.342324] Booting Node 1, Processors #6
> [ 0.344344] smpboot cpu 6: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.372110] #7
> [ 0.373771] smpboot cpu 7: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.396104] #8
> [ 0.397764] smpboot cpu 8: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.420109] #9
> [ 0.421773] smpboot cpu 9: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.444113] #10
> [ 0.445865] smpboot cpu 10: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
> [ 0.468111] #11
> [ 0.472030] smpboot cpu 11: start_ip = 83000
> [ 0.016000] NUMA core number 1 differs from configured core number 0
>
> These NUMA core numbering error messages are plain wrong.
>
> The confusing/misleading error message was introduced with commit
> 64be4c1c2428e148de6081af235e2418e6a66dda (x86: Add x86_init platform
> override to fix up NUMA core numbering) and should be removed.
>
> Reported-by: Borislav Petkov<[email protected]>
> Signed-off-by: Andreas Herrmann<[email protected]>
> ---
> arch/x86/kernel/cpu/common.c | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
> BTW, I wonder why the fixup code isn't called from the Intel path. At
> least the mentioned patch suggests that something more generic was
> introduced here.
> Right, and I would remove the check in amd.c:srat_detect_node() instead
> of removing the printk statement in the default implementation.
>
> IOW, we need more info on why the check was added only to the AMD path?
> Daniel?

The check and fixup wasn't needed in the Intel path thus far, so wasn't
added.

We could specialise the 'if (c->phys_proc_id != node)' test to check for
x86_cpuinit.fixup_cpu_id being NULL and drop the default override, if
that is preferred?

Thanks,
Daniel

--
Daniel J Blueman
Principal Software Engineer, Numascale Asia

2012-02-21 11:20:54

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86: Remove wrong error message in x86_default_fixup_cpu_id

On Tue, Feb 21, 2012 at 11:05:21AM +0000, Daniel J Blueman wrote:
> The check and fixup wasn't needed in the Intel path thus far, so
> wasn't added.
>
> We could specialise the 'if (c->phys_proc_id != node)' test to check
> for x86_cpuinit.fixup_cpu_id being NULL and drop the default
> override, if that is preferred?

Before that, why do you need that check in the AMD path at all? Please
give a more detailed explanation as to why is it needed on the AMD path
at all.

Thanks.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

2012-02-21 12:56:53

by Daniel J Blueman

[permalink] [raw]
Subject: Re: [PATCH] x86: Remove wrong error message in x86_default_fixup_cpu_id

On 21/02/2012 11:20, Borislav Petkov wrote:
> On Tue, Feb 21, 2012 at 11:05:21AM +0000, Daniel J Blueman wrote:
>> The check and fixup wasn't needed in the Intel path thus far, so
>> wasn't added.
>>
>> We could specialise the 'if (c->phys_proc_id != node)' test to check
>> for x86_cpuinit.fixup_cpu_id being NULL and drop the default
>> override, if that is preferred?
> Before that, why do you need that check in the AMD path at all? Please
> give a more detailed explanation as to why is it needed on the AMD path
> at all.

Since Numascale's NumaConnect bridges multiple separate HyperTransport
fabrics across multiple servers together, the HT IDs written in the
hardware thus don't match the information in the SRAT table constructed
in the bootloader, thus we need to set this to the logical value [1,
'fixup_cpu_id'].

[1] https://lkml.org/lkml/2011/12/5/292

--
Daniel J Blueman
Principal Software Engineer, Numascale Asia

Subject: Re: [PATCH] x86: Remove wrong error message in x86_default_fixup_cpu_id

On Tue, Feb 21, 2012 at 11:05:21AM +0000, Daniel J Blueman wrote:
> On 21/02/2012 10:27, Borislav Petkov wrote:
> > On Mon, Feb 20, 2012 at 06:17:05PM +0100, Andreas Herrmann wrote:

[snip]

> > BTW, I wonder why the fixup code isn't called from the Intel path. At
> > least the mentioned patch suggests that something more generic was
> > introduced here.
> > Right, and I would remove the check in amd.c:srat_detect_node() instead
> > of removing the printk statement in the default implementation.
> >
> > IOW, we need more info on why the check was added only to the AMD path?
> > Daniel?
>
> The check and fixup wasn't needed in the Intel path thus far, so wasn't
> added.
>
> We could specialise the 'if (c->phys_proc_id != node)' test to check for
> x86_cpuinit.fixup_cpu_id being NULL and drop the default override, if
> that is preferred?

It seems that all the stuff in x86_init.[ch] is using default/noop
functions instead of NULL pointer checks. So we shouldn't deviate from
this for x86_cpuinit.fixup_cpu_id.

I think attached patch is more suitable to avoid the wrong warning
message.

Please review.


Thanks,

Andreas

--
x86: Remove wrong error message in x86_default_fixup_cpu_id

It's only called from amd.c:srat_detect_node(). The introduced
condition for calling the fixup code is true for all AMD multi-node
processors, e.g. Magny-Cours and Interlagos. There we have 2 NUMA
nodes on one socket. Thus there are cores having different
numa-node-id but with equal phys_proc_id.

There is no point to print error messages in such a situation.

The confusing/misleading error message was introduced with commit
64be4c1c2428e148de6081af235e2418e6a66dda (x86: Add x86_init platform
override to fix up NUMA core numbering).

Change the default fixup function (remove the error message), move the
Numascale-specific condition for calling the fixup into the
fixup-function itself and slightly adapt the comment.

Signed-off-by: Andreas Herrmann <[email protected]>
---
arch/x86/include/asm/x86_init.h | 2 +-
arch/x86/kernel/apic/apic_numachip.c | 7 +++++--
arch/x86/kernel/cpu/amd.c | 8 ++++----
arch/x86/kernel/cpu/common.c | 9 ---------
arch/x86/kernel/x86_init.c | 1 +
5 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 517d476..1bcacef 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -189,6 +189,6 @@ extern struct x86_msi_ops x86_msi;

extern void x86_init_noop(void);
extern void x86_init_uint_noop(unsigned int unused);
-extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node);
+extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n);

#endif
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index 09d3d8c..ade0182 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -201,8 +201,11 @@ static void __init map_csrs(void)

static void fixup_cpu_id(struct cpuinfo_x86 *c, int node)
{
- c->phys_proc_id = node;
- per_cpu(cpu_llc_id, smp_processor_id()) = node;
+
+ if (c->phys_proc_id != node) {
+ c->phys_proc_id = node;
+ per_cpu(cpu_llc_id, smp_processor_id()) = node;
+ }
}

static int __init numachip_system_init(void)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f4773f4..52b7287 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -352,11 +352,11 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
node = per_cpu(cpu_llc_id, cpu);

/*
- * If core numbers are inconsistent, it's likely a multi-fabric platform,
- * so invoke platform-specific handler
+ * On multi-fabric platform (e.g. Numascale NumaChip) a
+ * platform-specific handler needs to be called to fixup some
+ * IDs of the CPU.
*/
- if (c->phys_proc_id != node)
- x86_cpuinit.fixup_cpu_id(c, node);
+ x86_cpuinit.fixup_cpu_id(c, node);

if (!node_online(node)) {
/*
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index d43cad7..37da7a6 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1158,15 +1158,6 @@ static void dbg_restore_debug_regs(void)
#endif /* ! CONFIG_KGDB */

/*
- * Prints an error where the NUMA and configured core-number mismatch and the
- * platform didn't override this to fix it up
- */
-void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node)
-{
- pr_err("NUMA core number %d differs from configured core number %d\n", node, c->phys_proc_id);
-}
-
-/*
* cpu_init() initializes state that is per-CPU. Some data is already
* initialized (naturally) in the bootstrap process, such as the GDT
* and IDT. We reload them nevertheless, this function acts as a
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 947a06c..67cf78a 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -90,6 +90,7 @@ struct x86_init_ops x86_init __initdata = {
},
};

+void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n) { }
struct x86_cpuinit_ops x86_cpuinit __cpuinitdata = {
.setup_percpu_clockev = setup_secondary_APIC_clock,
.fixup_cpu_id = x86_default_fixup_cpu_id,
--
1.7.8.4


2012-02-23 10:23:13

by Daniel J Blueman

[permalink] [raw]
Subject: Re: [PATCH] x86: Remove wrong error message in x86_default_fixup_cpu_id

On 22/02/2012 13:47, Andreas Herrmann wrote:
> On Tue, Feb 21, 2012 at 11:05:21AM +0000, Daniel J Blueman wrote:
>> On 21/02/2012 10:27, Borislav Petkov wrote:
>>> On Mon, Feb 20, 2012 at 06:17:05PM +0100, Andreas Herrmann wrote:
> [snip]
>
>>> BTW, I wonder why the fixup code isn't called from the Intel path. At
>>> least the mentioned patch suggests that something more generic was
>>> introduced here.
>>> Right, and I would remove the check in amd.c:srat_detect_node() instead
>>> of removing the printk statement in the default implementation.
>>>
>>> IOW, we need more info on why the check was added only to the AMD path?
>>> Daniel?
>> The check and fixup wasn't needed in the Intel path thus far, so wasn't
>> added.
>>
>> We could specialise the 'if (c->phys_proc_id != node)' test to check for
>> x86_cpuinit.fixup_cpu_id being NULL and drop the default override, if
>> that is preferred?
> It seems that all the stuff in x86_init.[ch] is using default/noop
> functions instead of NULL pointer checks. So we shouldn't deviate from
> this for x86_cpuinit.fixup_cpu_id.
>
> I think attached patch is more suitable to avoid the wrong warning
> message.
>
> Please review.

Yes, this looks reasonable and tests out successfully on systems with
and without NumaConnect.

Signed-off-by: Daniel J Blueman <[email protected]>

Thanks,
Daniel

>
>
> Thanks,
>
> Andreas
>
> --
> x86: Remove wrong error message in x86_default_fixup_cpu_id
>
> It's only called from amd.c:srat_detect_node(). The introduced
> condition for calling the fixup code is true for all AMD multi-node
> processors, e.g. Magny-Cours and Interlagos. There we have 2 NUMA
> nodes on one socket. Thus there are cores having different
> numa-node-id but with equal phys_proc_id.
>
> There is no point to print error messages in such a situation.
>
> The confusing/misleading error message was introduced with commit
> 64be4c1c2428e148de6081af235e2418e6a66dda (x86: Add x86_init platform
> override to fix up NUMA core numbering).
>
> Change the default fixup function (remove the error message), move the
> Numascale-specific condition for calling the fixup into the
> fixup-function itself and slightly adapt the comment.
>
> Signed-off-by: Andreas Herrmann<[email protected]>
> ---
> arch/x86/include/asm/x86_init.h | 2 +-
> arch/x86/kernel/apic/apic_numachip.c | 7 +++++--
> arch/x86/kernel/cpu/amd.c | 8 ++++----
> arch/x86/kernel/cpu/common.c | 9 ---------
> arch/x86/kernel/x86_init.c | 1 +
> 5 files changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
> index 517d476..1bcacef 100644
> --- a/arch/x86/include/asm/x86_init.h
> +++ b/arch/x86/include/asm/x86_init.h
> @@ -189,6 +189,6 @@ extern struct x86_msi_ops x86_msi;
>
> extern void x86_init_noop(void);
> extern void x86_init_uint_noop(unsigned int unused);
> -extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node);
> +extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n);
>
> #endif
> diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
> index 09d3d8c..ade0182 100644
> --- a/arch/x86/kernel/apic/apic_numachip.c
> +++ b/arch/x86/kernel/apic/apic_numachip.c
> @@ -201,8 +201,11 @@ static void __init map_csrs(void)
>
> static void fixup_cpu_id(struct cpuinfo_x86 *c, int node)
> {
> - c->phys_proc_id = node;
> - per_cpu(cpu_llc_id, smp_processor_id()) = node;
> +
> + if (c->phys_proc_id != node) {
> + c->phys_proc_id = node;
> + per_cpu(cpu_llc_id, smp_processor_id()) = node;
> + }
> }
>
> static int __init numachip_system_init(void)
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index f4773f4..52b7287 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -352,11 +352,11 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
> node = per_cpu(cpu_llc_id, cpu);
>
> /*
> - * If core numbers are inconsistent, it's likely a multi-fabric platform,
> - * so invoke platform-specific handler
> + * On multi-fabric platform (e.g. Numascale NumaChip) a
> + * platform-specific handler needs to be called to fixup some
> + * IDs of the CPU.
> */
> - if (c->phys_proc_id != node)
> - x86_cpuinit.fixup_cpu_id(c, node);
> + x86_cpuinit.fixup_cpu_id(c, node);
>
> if (!node_online(node)) {
> /*
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index d43cad7..37da7a6 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1158,15 +1158,6 @@ static void dbg_restore_debug_regs(void)
> #endif /* ! CONFIG_KGDB */
>
> /*
> - * Prints an error where the NUMA and configured core-number mismatch and the
> - * platform didn't override this to fix it up
> - */
> -void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node)
> -{
> - pr_err("NUMA core number %d differs from configured core number %d\n", node, c->phys_proc_id);
> -}
> -
> -/*
> * cpu_init() initializes state that is per-CPU. Some data is already
> * initialized (naturally) in the bootstrap process, such as the GDT
> * and IDT. We reload them nevertheless, this function acts as a
> diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
> index 947a06c..67cf78a 100644
> --- a/arch/x86/kernel/x86_init.c
> +++ b/arch/x86/kernel/x86_init.c
> @@ -90,6 +90,7 @@ struct x86_init_ops x86_init __initdata = {
> },
> };
>
> +void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n) { }
> struct x86_cpuinit_ops x86_cpuinit __cpuinitdata = {
> .setup_percpu_clockev = setup_secondary_APIC_clock,
> .fixup_cpu_id = x86_default_fixup_cpu_id,
--
Daniel J Blueman
Principal Software Engineer, Numascale Asia

Subject: [PATCH resend] x86: Remove wrong error message in x86_default_fixup_cpu_id


It's only called from amd.c:srat_detect_node(). The introduced
condition for calling the fixup code is true for all AMD multi-node
processors, e.g. Magny-Cours and Interlagos. There we have 2 NUMA
nodes on one socket. Thus there are cores having different
numa-node-id but with equal phys_proc_id.

There is no point to print error messages in such a situation.

The confusing/misleading error message was introduced with commit
64be4c1c2428e148de6081af235e2418e6a66dda (x86: Add x86_init platform
override to fix up NUMA core numbering).

Change the default fixup function (remove the error message), move the
Numascale-specific condition for calling the fixup into the
fixup-function itself and slightly adapt the comment.

Signed-off-by: Andreas Herrmann <[email protected]>
Reviewed-by: Daniel J Blueman <[email protected]>
---
arch/x86/include/asm/x86_init.h | 2 +-
arch/x86/kernel/apic/apic_numachip.c | 7 +++++--
arch/x86/kernel/cpu/amd.c | 8 ++++----
arch/x86/kernel/cpu/common.c | 9 ---------
arch/x86/kernel/x86_init.c | 1 +
5 files changed, 11 insertions(+), 16 deletions(-)


Please apply.


Thanks,

Andreas


diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 517d476..1bcacef 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -189,6 +189,6 @@ extern struct x86_msi_ops x86_msi;

extern void x86_init_noop(void);
extern void x86_init_uint_noop(unsigned int unused);
-extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node);
+extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n);

#endif
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index 09d3d8c..ade0182 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -201,8 +201,11 @@ static void __init map_csrs(void)

static void fixup_cpu_id(struct cpuinfo_x86 *c, int node)
{
- c->phys_proc_id = node;
- per_cpu(cpu_llc_id, smp_processor_id()) = node;
+
+ if (c->phys_proc_id != node) {
+ c->phys_proc_id = node;
+ per_cpu(cpu_llc_id, smp_processor_id()) = node;
+ }
}

static int __init numachip_system_init(void)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f4773f4..52b7287 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -352,11 +352,11 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
node = per_cpu(cpu_llc_id, cpu);

/*
- * If core numbers are inconsistent, it's likely a multi-fabric platform,
- * so invoke platform-specific handler
+ * On multi-fabric platform (e.g. Numascale NumaChip) a
+ * platform-specific handler needs to be called to fixup some
+ * IDs of the CPU.
*/
- if (c->phys_proc_id != node)
- x86_cpuinit.fixup_cpu_id(c, node);
+ x86_cpuinit.fixup_cpu_id(c, node);

if (!node_online(node)) {
/*
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index d43cad7..37da7a6 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1158,15 +1158,6 @@ static void dbg_restore_debug_regs(void)
#endif /* ! CONFIG_KGDB */

/*
- * Prints an error where the NUMA and configured core-number mismatch and the
- * platform didn't override this to fix it up
- */
-void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node)
-{
- pr_err("NUMA core number %d differs from configured core number %d\n", node, c->phys_proc_id);
-}
-
-/*
* cpu_init() initializes state that is per-CPU. Some data is already
* initialized (naturally) in the bootstrap process, such as the GDT
* and IDT. We reload them nevertheless, this function acts as a
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 947a06c..67cf78a 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -90,6 +90,7 @@ struct x86_init_ops x86_init __initdata = {
},
};

+void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n) { }
struct x86_cpuinit_ops x86_cpuinit __cpuinitdata = {
.setup_percpu_clockev = setup_secondary_APIC_clock,
.fixup_cpu_id = x86_default_fixup_cpu_id,
--
1.7.8.4


Subject: [tip:x86/platform] x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id()

Commit-ID: d7dedbe3694bbf97b4e3409e1fa52d97a3b9f028
Gitweb: http://git.kernel.org/tip/d7dedbe3694bbf97b4e3409e1fa52d97a3b9f028
Author: Andreas Herrmann <[email protected]>
AuthorDate: Fri, 24 Feb 2012 16:31:27 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 27 Feb 2012 10:35:47 +0100

x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id()

It's only called from amd.c:srat_detect_node(). The introduced
condition for calling the fixup code is true for all AMD
multi-node processors, e.g. Magny-Cours and Interlagos. There we
have 2 NUMA nodes on one socket. Thus there are cores having
different numa-node-id but with equal phys_proc_id.

There is no point to print error messages in such a situation.

The confusing/misleading error message was introduced with
commit 64be4c1c2428e148de6081af235e2418e6a66dda (x86: Add
x86_init platform override to fix up NUMA core numbering).

Change the default fixup function (remove the error message),
move the Numascale-specific condition for calling the fixup into
the fixup-function itself and slightly adapt the comment.

Signed-off-by: Andreas Herrmann <[email protected]>
Reviewed-by: Daniel J Blueman <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Steffen Persvold <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/x86_init.h | 2 +-
arch/x86/kernel/apic/apic_numachip.c | 7 +++++--
arch/x86/kernel/cpu/amd.c | 8 ++++----
arch/x86/kernel/cpu/common.c | 9 ---------
arch/x86/kernel/x86_init.c | 1 +
5 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 517d476..1bcacef 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -189,6 +189,6 @@ extern struct x86_msi_ops x86_msi;

extern void x86_init_noop(void);
extern void x86_init_uint_noop(unsigned int unused);
-extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node);
+extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n);

#endif
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index 09d3d8c..ade0182 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -201,8 +201,11 @@ static void __init map_csrs(void)

static void fixup_cpu_id(struct cpuinfo_x86 *c, int node)
{
- c->phys_proc_id = node;
- per_cpu(cpu_llc_id, smp_processor_id()) = node;
+
+ if (c->phys_proc_id != node) {
+ c->phys_proc_id = node;
+ per_cpu(cpu_llc_id, smp_processor_id()) = node;
+ }
}

static int __init numachip_system_init(void)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f4773f4..52b7287 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -352,11 +352,11 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
node = per_cpu(cpu_llc_id, cpu);

/*
- * If core numbers are inconsistent, it's likely a multi-fabric platform,
- * so invoke platform-specific handler
+ * On multi-fabric platform (e.g. Numascale NumaChip) a
+ * platform-specific handler needs to be called to fixup some
+ * IDs of the CPU.
*/
- if (c->phys_proc_id != node)
- x86_cpuinit.fixup_cpu_id(c, node);
+ x86_cpuinit.fixup_cpu_id(c, node);

if (!node_online(node)) {
/*
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index d43cad7..37da7a6 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1158,15 +1158,6 @@ static void dbg_restore_debug_regs(void)
#endif /* ! CONFIG_KGDB */

/*
- * Prints an error where the NUMA and configured core-number mismatch and the
- * platform didn't override this to fix it up
- */
-void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node)
-{
- pr_err("NUMA core number %d differs from configured core number %d\n", node, c->phys_proc_id);
-}
-
-/*
* cpu_init() initializes state that is per-CPU. Some data is already
* initialized (naturally) in the bootstrap process, such as the GDT
* and IDT. We reload them nevertheless, this function acts as a
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 947a06c..67cf78a 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -90,6 +90,7 @@ struct x86_init_ops x86_init __initdata = {
},
};

+void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n) { }
struct x86_cpuinit_ops x86_cpuinit __cpuinitdata = {
.setup_percpu_clockev = setup_secondary_APIC_clock,
.fixup_cpu_id = x86_default_fixup_cpu_id,

2012-02-28 15:28:10

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/platform] x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id()

On Mon, Feb 27, 2012 at 04:07:33AM -0800, tip-bot for Andreas Herrmann wrote:
> Commit-ID: d7dedbe3694bbf97b4e3409e1fa52d97a3b9f028
> Gitweb: http://git.kernel.org/tip/d7dedbe3694bbf97b4e3409e1fa52d97a3b9f028
> Author: Andreas Herrmann <[email protected]>
> AuthorDate: Fri, 24 Feb 2012 16:31:27 +0100
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Mon, 27 Feb 2012 10:35:47 +0100
>
> x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id()
>
> It's only called from amd.c:srat_detect_node(). The introduced
> condition for calling the fixup code is true for all AMD
> multi-node processors, e.g. Magny-Cours and Interlagos. There we
> have 2 NUMA nodes on one socket. Thus there are cores having
> different numa-node-id but with equal phys_proc_id.
>
> There is no point to print error messages in such a situation.
>
> The confusing/misleading error message was introduced with
> commit 64be4c1c2428e148de6081af235e2418e6a66dda (x86: Add
> x86_init platform override to fix up NUMA core numbering).
>
> Change the default fixup function (remove the error message),
> move the Numascale-specific condition for calling the fixup into
> the fixup-function itself and slightly adapt the comment.
>
> Signed-off-by: Andreas Herrmann <[email protected]>
> Reviewed-by: Daniel J Blueman <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Steffen Persvold <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Ingo Molnar <[email protected]>

Maybe this one should be queued for x86/urgent instead and should go to
Linus soonish since it cures an issue which came in during the 3.3 merge
window with 64be4c1c2428e148de6081af235e2418e6a66dda?

Thanks.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

Subject: [tip:x86/urgent] x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id()

Commit-ID: 20b25f03fbd17b4e32b786024a5f9c6306795411
Gitweb: http://git.kernel.org/tip/20b25f03fbd17b4e32b786024a5f9c6306795411
Author: Andreas Herrmann <[email protected]>
AuthorDate: Fri, 24 Feb 2012 16:31:27 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 28 Feb 2012 17:30:00 +0100

x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id()

It's only called from amd.c:srat_detect_node(). The introduced
condition for calling the fixup code is true for all AMD
multi-node processors, e.g. Magny-Cours and Interlagos. There we
have 2 NUMA nodes on one socket. Thus there are cores having
different numa-node-id but with equal phys_proc_id.

There is no point to print error messages in such a situation.

The confusing/misleading error message was introduced with
commit 64be4c1c2428e148de6081af235e2418e6a66dda (x86: Add
x86_init platform override to fix up NUMA core numbering).

Change the default fixup function (remove the error message),
move the Numascale-specific condition for calling the fixup into
the fixup-function itself and slightly adapt the comment.

Signed-off-by: Andreas Herrmann <[email protected]>
Reviewed-by: Daniel J Blueman <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Steffen Persvold <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/x86_init.h | 2 +-
arch/x86/kernel/apic/apic_numachip.c | 7 +++++--
arch/x86/kernel/cpu/amd.c | 8 ++++----
arch/x86/kernel/cpu/common.c | 9 ---------
arch/x86/kernel/x86_init.c | 1 +
5 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 517d476..1bcacef 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -189,6 +189,6 @@ extern struct x86_msi_ops x86_msi;

extern void x86_init_noop(void);
extern void x86_init_uint_noop(unsigned int unused);
-extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node);
+extern void x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n);

#endif
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index 09d3d8c..ade0182 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -201,8 +201,11 @@ static void __init map_csrs(void)

static void fixup_cpu_id(struct cpuinfo_x86 *c, int node)
{
- c->phys_proc_id = node;
- per_cpu(cpu_llc_id, smp_processor_id()) = node;
+
+ if (c->phys_proc_id != node) {
+ c->phys_proc_id = node;
+ per_cpu(cpu_llc_id, smp_processor_id()) = node;
+ }
}

static int __init numachip_system_init(void)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f4773f4..52b7287 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -352,11 +352,11 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
node = per_cpu(cpu_llc_id, cpu);

/*
- * If core numbers are inconsistent, it's likely a multi-fabric platform,
- * so invoke platform-specific handler
+ * On multi-fabric platform (e.g. Numascale NumaChip) a
+ * platform-specific handler needs to be called to fixup some
+ * IDs of the CPU.
*/
- if (c->phys_proc_id != node)
- x86_cpuinit.fixup_cpu_id(c, node);
+ x86_cpuinit.fixup_cpu_id(c, node);

if (!node_online(node)) {
/*
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index c0f7d68..1a810e4 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1163,15 +1163,6 @@ static void dbg_restore_debug_regs(void)
#endif /* ! CONFIG_KGDB */

/*
- * Prints an error where the NUMA and configured core-number mismatch and the
- * platform didn't override this to fix it up
- */
-void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int node)
-{
- pr_err("NUMA core number %d differs from configured core number %d\n", node, c->phys_proc_id);
-}
-
-/*
* cpu_init() initializes state that is per-CPU. Some data is already
* initialized (naturally) in the bootstrap process, such as the GDT
* and IDT. We reload them nevertheless, this function acts as a
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 947a06c..67cf78a 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -90,6 +90,7 @@ struct x86_init_ops x86_init __initdata = {
},
};

+void __cpuinit x86_default_fixup_cpu_id(struct cpuinfo_x86 *c, int n) { }
struct x86_cpuinit_ops x86_cpuinit __cpuinitdata = {
.setup_percpu_clockev = setup_secondary_APIC_clock,
.fixup_cpu_id = x86_default_fixup_cpu_id,