2012-02-02 19:10:25

by Borislav Petkov

Subject: [PATCH] MCE, AMD: Select SMP explicitly

Randy Dunlap reported that building randconfigs which do not select
CONFIG_SMP causes the following link error:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'

Fix it.

Acked-by: Randy Dunlap <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/Kconfig | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5bed94e..63fcb8a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -897,7 +897,7 @@ config X86_MCE_INTEL
config X86_MCE_AMD
def_bool y
prompt "AMD MCE features"
- depends on X86_MCE && X86_LOCAL_APIC
+ depends on X86_MCE && X86_LOCAL_APIC && SMP
---help---
Additional support for AMD specific MCE features such as
the DRAM Error Threshold.
--
1.7.9


2012-02-02 19:37:08

by Nick Bowler

Subject: Re: [PATCH] MCE, AMD: Select SMP explicitly

Hi,

On 2012-02-02 20:10 +0100, Borislav Petkov wrote:
> Randy Dunlap reported that building randconfigs which do not select
> CONFIG_SMP cause the following link error:
>
> mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'
>
> Fix it.
>
> Acked-by: Randy Dunlap <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Borislav Petkov <[email protected]>
[...]
> config X86_MCE_AMD
> def_bool y
> prompt "AMD MCE features"
> - depends on X86_MCE && X86_LOCAL_APIC
> + depends on X86_MCE && X86_LOCAL_APIC && SMP

Is this feature truly irrelevant on UP systems? I ask because I've
always enabled this option on my UP AMD systems in the past...

Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

2012-02-02 20:24:33

by Borislav Petkov

Subject: Re: [PATCH] MCE, AMD: Select SMP explicitly

On Thu, Feb 02, 2012 at 02:37:02PM -0500, Nick Bowler wrote:
> Hi,
>
> On 2012-02-02 20:10 +0100, Borislav Petkov wrote:
> > Randy Dunlap reported that building randconfigs which do not select
> > CONFIG_SMP cause the following link error:
> >
> > mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'
> >
> > Fix it.
> >
> > Acked-by: Randy Dunlap <[email protected]>
> > Link: http://lkml.kernel.org/r/[email protected]
> > Signed-off-by: Borislav Petkov <[email protected]>
> [...]
> > config X86_MCE_AMD
> > def_bool y
> > prompt "AMD MCE features"
> > - depends on X86_MCE && X86_LOCAL_APIC
> > + depends on X86_MCE && X86_LOCAL_APIC && SMP
>
> Is this feature truly irrelevant on UP systems? I ask because I've
> always enabled this option on my UP AMD systems in the past...

No, you're right. Thanks for the suggestion. Scratch that version, I'll
think of a better fix.

Thanks.

--
Regards/Gruss,
Boris.

2012-02-03 19:18:09

by Borislav Petkov

Subject: MCE, AMD: Hide smp-only code around CONFIG_SMP

On Thu, Feb 02, 2012 at 09:24:28PM +0100, Borislav Petkov wrote:
> > Is this feature truly irrelevant on UP systems? I ask because I've
> > always enabled this option on my UP AMD systems in the past...
>
> No, you're right. Thanks for the suggestion. Scratch that version, I'll
> think of a better fix.

Ok, I think I got it, pls take a look and scream if something's amiss.

@Randy: it builds fine with your randconfig and with my default one;
I'd appreciate it if you could run it too, just in case.

Thanks.

--
From: Borislav Petkov <[email protected]>
Date: Fri, 3 Feb 2012 18:07:54 +0100
Subject: [PATCH] MCE, AMD: Hide smp-only code around CONFIG_SMP

141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
'struct cpuinfo_x86'") removed a bunch of CONFIG_SMP ifdefs around code
touching struct cpuinfo_x86 members but also caused the following build
error with Randy's randconfigs:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'

Restore the #ifdef in threshold_create_bank() which creates symlinks on
the non-BSP CPUs.

Cc: Kevin Winchester <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Randy Dunlap <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 786e76a..e4eeaaf 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -528,6 +528,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)

sprintf(name, "threshold_bank%i", bank);

+#ifdef CONFIG_SMP
if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
i = cpumask_first(cpu_llc_shared_mask(cpu));

@@ -553,6 +554,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)

goto out;
}
+#endif

b = kzalloc(sizeof(struct threshold_bank), GFP_KERNEL);
if (!b) {
--
1.7.9


--
Regards/Gruss,
Boris.

2012-02-03 21:07:56

by Randy Dunlap

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP

On 02/03/2012 11:18 AM, Borislav Petkov wrote:
> On Thu, Feb 02, 2012 at 09:24:28PM +0100, Borislav Petkov wrote:
>>> Is this feature truly irrelevant on UP systems? I ask because I've
>>> always enabled this option on my UP AMD systems in the past...
>>
>> No, you're right. Thanks for the suggestion. Scratch that version, I'll
>> think of a better fix.
>
> Ok, I think I got it, pls take a look and scream if something's amiss.
>
> @Randy: it builds fine with your randconfig and with mine default one;
> I'd appreciate if you could run it too, just in case.

Yes, it's fine for me also.
Thanks.

Acked-by: Randy Dunlap <[email protected]>


> Thanks.
>
> --
> From: Borislav Petkov <[email protected]>
> Date: Fri, 3 Feb 2012 18:07:54 +0100
> Subject: [PATCH] MCE, AMD: Hide smp-only code around CONFIG_SMP
>
> 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
> 'struct cpuinfo_x86'") removed a bunch of CONFIG_SMP ifdefs around code
> touching struct cpuinfo_x86 members but also caused the following build
> error with Randy's randconfigs:
>
> mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'
>
> Restore the #ifdef in threshold_create_bank() which creates symlinks on
> the non-BSP CPUs.
>
> Cc: Kevin Winchester <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce_amd.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> index 786e76a..e4eeaaf 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> @@ -528,6 +528,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
>
> sprintf(name, "threshold_bank%i", bank);
>
> +#ifdef CONFIG_SMP
> if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
> i = cpumask_first(cpu_llc_shared_mask(cpu));
>
> @@ -553,6 +554,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
>
> goto out;
> }
> +#endif
>
> b = kzalloc(sizeof(struct threshold_bank), GFP_KERNEL);
> if (!b) {


--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2012-02-07 09:58:07

by Ingo Molnar

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP


* Borislav Petkov <[email protected]> wrote:

> On Thu, Feb 02, 2012 at 09:24:28PM +0100, Borislav Petkov wrote:
> > > Is this feature truly irrelevant on UP systems? I ask because I've
> > > always enabled this option on my UP AMD systems in the past...
> >
> > No, you're right. Thanks for the suggestion. Scratch that version, I'll
> > think of a better fix.
>
> Ok, I think I got it, pls take a look and scream if something's amiss.
>
> @Randy: it builds fine with your randconfig and with mine default one;
> I'd appreciate if you could run it too, just in case.
>
> Thanks.
>
> --
> From: Borislav Petkov <[email protected]>
> Date: Fri, 3 Feb 2012 18:07:54 +0100
> Subject: [PATCH] MCE, AMD: Hide smp-only code around CONFIG_SMP
>
> 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
> 'struct cpuinfo_x86'") removed a bunch of CONFIG_SMP ifdefs around code
> touching struct cpuinfo_x86 members but also caused the following build
> error with Randy's randconfigs:
>
> mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'
>
> Restore the #ifdef in threshold_create_bank() which creates symlinks on
> the non-BSP CPUs.
>
> Cc: Kevin Winchester <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce_amd.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> index 786e76a..e4eeaaf 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
> @@ -528,6 +528,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
>
> sprintf(name, "threshold_bank%i", bank);
>
> +#ifdef CONFIG_SMP
> if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
> i = cpumask_first(cpu_llc_shared_mask(cpu));

Could we please just define an obvious cpu_llc_shared_mask on UP
(one bit long and set to 1) and not reintroduce the ugly
#ifdef CONFIG_SMP?

Thanks,

Ingo

2012-02-08 00:41:42

by Kevin Winchester

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP

On 7 February 2012 05:57, Ingo Molnar <[email protected]> wrote:
>
> * Borislav Petkov <[email protected]> wrote:
>
>> On Thu, Feb 02, 2012 at 09:24:28PM +0100, Borislav Petkov wrote:
>> > > Is this feature truly irrelevant on UP systems? I ask because I've
>> > > always enabled this option on my UP AMD systems in the past...
>> >
>> > No, you're right. Thanks for the suggestion. Scratch that version, I'll
>> > think of a better fix.
>>
>> Ok, I think I got it, pls take a look and scream if something's amiss.
>>
>> @Randy: it builds fine with your randconfig and with mine default one;
>> I'd appreciate if you could run it too, just in case.
>>
>> Thanks.
>>
>> --
>> From: Borislav Petkov <[email protected]>
>> Date: Fri, 3 Feb 2012 18:07:54 +0100
>> Subject: [PATCH] MCE, AMD: Hide smp-only code around CONFIG_SMP
>>
>> 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
>> 'struct cpuinfo_x86'") removed a bunch of CONFIG_SMP ifdefs around code
>> touching struct cpuinfo_x86 members but also caused the following build
>> error with Randy's randconfigs:
>>
>> mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'
>>
>> Restore the #ifdef in threshold_create_bank() which creates symlinks on
>> the non-BSP CPUs.
>>
>
> Could we please just define an obvious cpu_llc_shared_mask on UP
> (one bit long and set to 1) and not reintroduce the ugly
> #ifdef CONFIG_SMP?
>

I ran into this when I was working on the original patch, and I
thought I had solved all of the issues like this. Unfortunately not.

The problem is that there doesn't seem to be a great place to define
this (along with a few other functions like this) on UP. We have
<linux/smp.h> including <asm/smp.h> in the CONFIG_SMP case, but no
arch-specific defs for UP. Would it be crazy to have <linux/smp.h>
include a new <asm/up.h> for !CONFIG_SMP?

I am willing to prepare the patch, if someone can tell me where to
define cpu_llc_shared_mask for UP?
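
For reference, the arrangement in question looks roughly like this
(paraphrased sketch; the <asm/up.h> include is only the hypothetical part
being proposed here, it does not exist today):

/* <linux/smp.h>, simplified sketch */
#ifdef CONFIG_SMP
#include <asm/smp.h>	/* arch SMP definitions, incl. cpu_llc_shared_map */
/* ... SMP-only prototypes ... */
#else /* !CONFIG_SMP */
#include <asm/up.h>	/* hypothetical: arch-specific UP definitions could live here */
/* ... generic UP stubs ... */
#endif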

Thanks,

--
Kevin Winchester

2012-02-08 10:20:03

by Borislav Petkov

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP

On Tue, Feb 07, 2012 at 10:57:46AM +0100, Ingo Molnar wrote:
> Could we please just define an obvious cpu_llc_shared_mask on UP
> (one bit long and set to 1) and not reintroduce the ugly
> #ifdef CONFIG_SMP?

Ok, here's a tentative solution where I was trying not to fatfinger the
cpumask magic. Let me know if this is similar to what you had in mind.
The call to cpu_llc_shared_mask(int cpu) on UP might cause a problem if
the cpu arg is not 0, but that should not happen anyway - and if it did,
the caller side would have other problems as well.

Thanks.

--
From: Borislav Petkov <[email protected]>
Date: Tue, 7 Feb 2012 21:40:22 +0100
Subject: [PATCH] MCE, AMD: Fix !CONFIG_SMP build

141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
'struct cpuinfo_x86'") removed a bunch of CONFIG_SMP ifdefs around
code touching struct cpuinfo_x86 members but also caused the
following build error with Randy's randconfigs:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'

Fix that by adding a UP version of the cpu_llc_shared_map, as Ingo
suggested.

Cc: Kevin Winchester <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Randy Dunlap <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/smp.h | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 0434c40..118754f 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -33,8 +33,15 @@ static inline bool cpu_has_ht_siblings(void)

DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
-/* cpus sharing the last level cache: */
+
+#ifdef CONFIG_SMP
+/* CPUs sharing the last level cache: */
DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
+#else
+static DECLARE_BITMAP(cpu_llc_shared_bits, NR_CPUS) __read_mostly = { [0] = 1UL };
+static struct cpumask *const cpu_llc_shared_map = to_cpumask(cpu_llc_shared_bits);
+#endif
+
DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

--
1.7.9

--
Regards/Gruss,
Boris.

2012-02-08 12:22:14

by Kevin Winchester

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP

On 8 February 2012 06:19, Borislav Petkov <[email protected]> wrote:
> On Tue, Feb 07, 2012 at 10:57:46AM +0100, Ingo Molnar wrote:
>> Could we please just define an obvious cpu_llc_shared_mask on UP
>> (one bit long and set to 1) and not reintroduce the ugly
>> #ifdef CONFIG_SMP?
>
> Ok, here's a tentative proposed solution where I was trying not to
> fatfinger the cpumask magic. Let me know if this is similar to what you
> had in mind. The call to cpu_llc_shared_mask(int cpu) on UP might cause
> a problem if the cpu arg is not 0 but it should not happen anyway and
> there should be other problems with the caller side anyway.
>
> Thanks.
>
> --
> From: Borislav Petkov <[email protected]>
> Date: Tue, 7 Feb 2012 21:40:22 +0100
> Subject: [PATCH] MCE, AMD: Fix !CONFIG_SMP build
>
> 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
> 'struct cpuinfo_x86'") removed a bunch of CONFIG_SMP ifdefs around
> code touching struct cpuinfo_x86 members but also caused the
> following build error with Randy's randconfigs:
>
> mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'
>
> Fix that by adding a UP version of the cpu_llc_shared_map, as Ingo
> suggested.
>
> Cc: Kevin Winchester <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/include/asm/smp.h |    9 ++++++++-
> 1 files changed, 8 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index 0434c40..118754f 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -33,8 +33,15 @@ static inline bool cpu_has_ht_siblings(void)
>
> DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
> DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
> -/* cpus sharing the last level cache: */
> +
> +#ifdef CONFIG_SMP
> +/* CPUs sharing the last level cache: */
> DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
> +#else
> +static DECLARE_BITMAP(cpu_llc_shared_bits, NR_CPUS) __read_mostly = { [0] = 1UL };
> +static struct cpumask *const cpu_llc_shared_map = to_cpumask(cpu_llc_shared_bits);
> +#endif
> +
> DECLARE_PER_CPU(u16, cpu_llc_id);
> DECLARE_PER_CPU(int, cpu_number);
>
> --
> 1.7.9
>

Does this work? arch/x86/kernel/cpu/mcheck/mce_amd.c includes
<linux/smp.h> rather than <asm/smp.h> directly. And <linux/smp.h>
only includes <asm/smp.h> for the CONFIG_SMP case. Or perhaps one of
the other includes in mce_amd.c includes <asm/smp.h>?

--
Kevin Winchester

2012-02-08 13:05:49

by Borislav Petkov

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP

On Wed, Feb 08, 2012 at 08:22:11AM -0400, Kevin Winchester wrote:
> Does this work? arch/x86/kernel/cpu/mcheck/mce_amd.c includes
> <linux/smp.h> rather than <asm/smp.h> directly. And <linux/smp.h>
> only includes <asm/smp.h> for the CONFIG_SMP case. Or perhaps one of
> the other includes in mce_amd.c includes <asm/smp.h>?

It looks like it:

$ make arch/x86/kernel/cpu/mcheck/mce_amd.i
$ grep 'asm/smp.h' arch/x86/kernel/cpu/mcheck/mce_amd.i
# 1 "/home/boris/kernel/linux-2.6/arch/x86/include/asm/smp.h" 1
# 13 "/home/boris/kernel/linux-2.6/arch/x86/include/asm/smp.h"
# 14 "/home/boris/kernel/linux-2.6/arch/x86/include/asm/smp.h" 2
# 16 "/home/boris/kernel/linux-2.6/arch/x86/include/asm/smp.h" 2

and this one pulls in the cpu_llc* crap.

# 183 "/home/boris/kernel/linux-2.6/arch/x86/include/asm/smp.h"
# 225 "/home/boris/kernel/linux-2.6/arch/x86/include/asm/smp.h"
# 236 "/home/boris/kernel/linux-2.6/arch/x86/include/asm/smp.h"

So some other headers seem to pull in asm/smp.h.

HTH.

--
Regards/Gruss,
Boris.

2012-02-09 08:07:08

by Ingo Molnar

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP


* Borislav Petkov <[email protected]> wrote:

> +++ b/arch/x86/include/asm/smp.h
> @@ -33,8 +33,15 @@ static inline bool cpu_has_ht_siblings(void)
>
> DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
> DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
> -/* cpus sharing the last level cache: */
> +
> +#ifdef CONFIG_SMP
> +/* CPUs sharing the last level cache: */
> DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
> +#else
> +static DECLARE_BITMAP(cpu_llc_shared_bits, NR_CPUS) __read_mostly = { [0] = 1UL };
> +static struct cpumask *const cpu_llc_shared_map = to_cpumask(cpu_llc_shared_bits);
> +#endif

Why not just expose it like on SMP?

We want to *reduce* the specialness of UP, not increase it - one
more word of .data and .text does not matter much - UP is
becoming more and more an oddball, rarely tested config. By the
time these changes hit any real boxes it will be even more
oddball.

Thanks,

Ingo

2012-02-10 00:00:48

by Kevin Winchester

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP

On 9 February 2012 04:06, Ingo Molnar <[email protected]> wrote:
>
> * Borislav Petkov <[email protected]> wrote:
>
>> +++ b/arch/x86/include/asm/smp.h
>> @@ -33,8 +33,15 @@ static inline bool cpu_has_ht_siblings(void)
>>
>> DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
>> DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
>> -/* cpus sharing the last level cache: */
>> +
>> +#ifdef CONFIG_SMP
>> +/* CPUs sharing the last level cache: */
>> DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
>> +#else
>> +static DECLARE_BITMAP(cpu_llc_shared_bits, NR_CPUS) __read_mostly = { [0] = 1UL };
>> +static struct cpumask *const cpu_llc_shared_map = to_cpumask(cpu_llc_shared_bits);
>> +#endif
>
> Why not just expose it like on SMP?
>
> We want to *reduce* the specialness of UP, not increase it - one
> more word of .data and .text does not matter much - UP is
> becoming more and more an oddball, rarely tested config. By the
> time these changes hit any real boxes it will be even more
> oddball.
>

It seems that cpu_llc_shared_map is actually defined in
arch/x86/kernel/smpboot.c, which is not compiled/linked for UP builds.
Is there an equivalent file for UP that could be used instead, or
could the:

DEFINE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);

be moved to some other file?

Generally, it sounds like you might approve of an eventual merging of
the boot paths for SMP and UP. Is that true? I wonder how much work
that would be. That would really reduce the specialness of UP.

--
Kevin Winchester

2012-02-11 14:07:50

by Ingo Molnar

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP


* Kevin Winchester <[email protected]> wrote:

> On 9 February 2012 04:06, Ingo Molnar <[email protected]> wrote:
> >
> > * Borislav Petkov <[email protected]> wrote:
> >
> >> +++ b/arch/x86/include/asm/smp.h
> >> @@ -33,8 +33,15 @@ static inline bool cpu_has_ht_siblings(void)
> >>
> >> DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
> >> DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
> >> -/* cpus sharing the last level cache: */
> >> +
> >> +#ifdef CONFIG_SMP
> >> +/* CPUs sharing the last level cache: */
> >> DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
> >> +#else
> >> +static DECLARE_BITMAP(cpu_llc_shared_bits, NR_CPUS) __read_mostly = { [0] = 1UL };
> >> +static struct cpumask *const cpu_llc_shared_map = to_cpumask(cpu_llc_shared_bits);
> >> +#endif
> >
> > Why not just expose it like on SMP?
> >
> > We want to *reduce* the specialness of UP, not increase it - one
> > more word of .data and .text does not matter much - UP is
> > becoming more and more an oddball, rarely tested config. By the
> > time these changes hit any real boxes it will be even more
> > oddball.
> >
>
> It seems that cpu_llc_shared_map is actually defined in
> arch/x86/kernel/smpboot.c, which is not compiled/linked for UP
> builds.
> Is there an equivalent file for UP that could be used
> instead, or could the:
>
> DEFINE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
>
> be moved to some other file?

Yes, it should be moved into struct cpuinfo_x86, and thus we'd
remove cpu_llc_shared_map altogether, it would be named
cpu->llc_shared_map or so - taking up a single bit (or maybe
zero bits) on UP.

> Generally, it sounds like you might approve of an eventual
> merging of the boot paths for SMP and UP. Is that true? I
> wonder how much work that would be. That would really reduce
> the specialness of UP.

I generally approve just about any patch that works and reduces
complexity! :-) The boot path is rather ambitious, but if you
want to try, feel free ...

Thanks,

Ingo

2012-02-12 00:24:48

by Kevin Winchester

Subject: [PATCH] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
'struct cpuinfo_x86'") caused the compilation error:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'

by removing an #ifdef CONFIG_SMP around a block containing a reference
to cpu_llc_shared_map. Rather than replace the #ifdef, move
cpu_llc_shared_map to be a new field llc_shared_map in struct
cpuinfo_x86 and adjust all references to cpu_llc_shared_map.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/smp.h | 6 ------
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/cpu/mcheck/mce_amd.c | 7 ++++---
arch/x86/kernel/smpboot.c | 15 ++++++---------
arch/x86/xen/smp.c | 1 -
6 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index aa9088c..50389d2 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -110,6 +110,7 @@ struct cpuinfo_x86 {
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
+ cpumask_var_t llc_shared_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 0434c40..f7599d0 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -34,7 +34,6 @@ static inline bool cpu_has_ht_siblings(void)
DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
/* cpus sharing the last level cache: */
-DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

@@ -48,11 +47,6 @@ static inline struct cpumask *cpu_core_mask(int cpu)
return per_cpu(cpu_core_map, cpu);
}

-static inline struct cpumask *cpu_llc_shared_mask(int cpu)
-{
- return per_cpu(cpu_llc_shared_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 73d08ed..03807f0 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -734,11 +734,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
ret = 0;
if (index == 3) {
ret = 1;
- for_each_cpu(i, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(i, c->llc_shared_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(sibling, c->llc_shared_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 786e76a..56899e1 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -525,11 +525,12 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
struct threshold_bank *b = NULL;
struct device *dev = mce_device[cpu];
char name[32];
+ struct cpuinfo_x86 *c = &cpu_data(cpu);

sprintf(name, "threshold_bank%i", bank);

- if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
- i = cpumask_first(cpu_llc_shared_mask(cpu));
+ if (c->cpu_core_id && shared_bank[bank]) { /* symlink */
+ i = cpumask_first(c->llc_shared_map);

/* first core not up yet */
if (cpu_data(i).cpu_core_id)
@@ -548,7 +549,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
if (err)
goto out;

- cpumask_copy(b->cpus, cpu_llc_shared_mask(cpu));
+ cpumask_copy(b->cpus, c->llc_shared_map);
per_cpu(threshold_banks, cpu)[bank] = b;

goto out;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 66d250c..2b18365 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -127,8 +127,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);

-DEFINE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -337,8 +335,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
- cpumask_set_cpu(cpu1, cpu_llc_shared_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_llc_shared_mask(cpu1));
+ cpumask_set_cpu(cpu1, cpu_data(cpu2).llc_shared_map);
+ cpumask_set_cpu(cpu2, cpu_data(cpu1).llc_shared_map);
}


@@ -367,7 +365,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
}

- cpumask_set_cpu(cpu, cpu_llc_shared_mask(cpu));
+ cpumask_set_cpu(cpu, c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
@@ -378,8 +376,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
for_each_cpu(i, cpu_sibling_setup_mask) {
if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
- cpumask_set_cpu(i, cpu_llc_shared_mask(cpu));
- cpumask_set_cpu(cpu, cpu_llc_shared_mask(i));
+ cpumask_set_cpu(i, c->llc_shared_map);
+ cpumask_set_cpu(cpu, cpu_data(i).llc_shared_map);
}
if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
@@ -418,7 +416,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
return cpu_core_mask(cpu);
else
- return cpu_llc_shared_mask(cpu);
+ return c->llc_shared_map;
}

static void impress_friends(void)
@@ -1053,7 +1051,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 041d4fe..a898ed5 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -225,7 +225,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

--
1.7.9

2012-02-12 00:31:21

by Kevin Winchester

Subject: Re: MCE, AMD: Hide smp-only code around CONFIG_SMP

On 11 February 2012 10:07, Ingo Molnar <[email protected]> wrote:
>
> * Kevin Winchester <[email protected]> wrote:
>
>> On 9 February 2012 04:06, Ingo Molnar <[email protected]> wrote:
>> >
>> > * Borislav Petkov <[email protected]> wrote:
>> >
>> >> +++ b/arch/x86/include/asm/smp.h
>> >> @@ -33,8 +33,15 @@ static inline bool cpu_has_ht_siblings(void)
>> >>
>> >> DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
>> >> DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
>> >> -/* cpus sharing the last level cache: */
>> >> +
>> >> +#ifdef CONFIG_SMP
>> >> +/* CPUs sharing the last level cache: */
>> >> DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
>> >> +#else
>> >> +static DECLARE_BITMAP(cpu_llc_shared_bits, NR_CPUS) __read_mostly = { [0] = 1UL };
>> >> +static struct cpumask *const cpu_llc_shared_map = to_cpumask(cpu_llc_shared_bits);
>> >> +#endif
>> >
>> > Why not just expose it like on SMP?
>> >
>> > We want to *reduce* the specialness of UP, not increase it - one
>> > more word of .data and .text does not matter much - UP is
>> > becoming more and more an oddball, rarely tested config. By the
>> > time these changes hit any real boxes it will be even more
>> > oddball.
>> >
>>
>> It seems that cpu_llc_shared_map is actually defined in
>> arch/x86/kernel/smpboot.c, which is not compiled/linked for UP
>> builds.
>> Is there an equivalent file for UP that could be used
>> instead, or could the:
>>
>> DEFINE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
>>
>> be moved to some other file?
>
> Yes, it should be moved into struct cpuinfo_x86, and thus we'd
> remove cpu_llc_shared_map altogether, it would be named
> cpu->llc_shared_map or so - taking up a single bit (or maybe
> zero bits) on UP.

I just sent out a patch trying this out. I built and booted it on SMP
and UP and didn't see any issues, but I don't have more than my one PC
on which to test, so I hope I didn't miss anything. I did notice a
few other per cpu variables (e.g. cpu_llc_id) that could perhaps
use the same treatment. Where would we want to draw the line
here?
>
>> Generally, it sounds like you might approve of an eventual
>> merging of the boot paths for SMP and UP. ?Is that true? ?I
>> wonder how much work that would be. ?That would really reduce
>> the specialness of UP.
>
> I generally approve just about any patch that works and reduces
> complexity! :-) The boot path is rather ambitious, but if you
> want to try, feel free ...
>

Yes, having walked through the boot path a little, it does seem rather
ambitious. The UP/SMP division seems to be coded right into
init/main.c, rather than being at the architecture level, so it would
be hard to make changes without disrupting other arches.

--
Kevin Winchester

2012-02-12 02:18:56

by Kevin Winchester

Subject: Re: [PATCH] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

On 11 February 2012 20:24, Kevin Winchester <[email protected]> wrote:
> Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
> 'struct cpuinfo_x86'") caused the compilation error:
>
> mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'
>
> by removing an #ifdef CONFIG_SMP around a block containing a reference
> to cpu_llc_shared_map. Rather than replace the #ifdef, move
> cpu_llc_shared_map to be a new field llc_shared_map in struct
> cpuinfo_x86 and adjust all references to cpu_llc_shared_map.
>
> Signed-off-by: Kevin Winchester <[email protected]>
> ---
> arch/x86/include/asm/processor.h      |    1 +
> arch/x86/include/asm/smp.h            |    6 ------
> arch/x86/kernel/cpu/intel_cacheinfo.c |    4 ++--
> arch/x86/kernel/cpu/mcheck/mce_amd.c  |    7 ++++---
> arch/x86/kernel/smpboot.c             |   15 ++++++---------
> arch/x86/xen/smp.c                    |    1 -
> 6 files changed, 13 insertions(+), 21 deletions(-)
>
>
> static void impress_friends(void)
> @@ -1053,7 +1051,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
>        for_each_possible_cpu(i) {
>                zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
>                zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
> -               zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
>        }
>        set_cpu_sibling_map(0);
>
> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> index 041d4fe..a898ed5 100644
> --- a/arch/x86/xen/smp.c
> +++ b/arch/x86/xen/smp.c
> @@ -225,7 +225,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
>        for_each_possible_cpu(i) {
>                zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
>                zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
> -               zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
>        }
>        set_cpu_sibling_map(0);
>

I just realized that I took out a couple of allocations here for
cpu_llc_shared_map, without replacing them. Am I leaving
cpuinfo_x86.llc_shared_map unallocated then, and just writing to
whatever address that field happened to get?

2012-02-12 11:19:35

by Ingo Molnar

Subject: Re: [PATCH] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86


* Kevin Winchester <[email protected]> wrote:

> On 11 February 2012 20:24, Kevin Winchester <[email protected]> wrote:
> > Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
> > 'struct cpuinfo_x86'") caused the compilation error:
> >
> > mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'
> >
> > by removing an #ifdef CONFIG_SMP around a block containing a reference
> > to cpu_llc_shared_map. Rather than replace the #ifdef, move
> > cpu_llc_shared_map to be a new field llc_shared_map in struct
> > cpuinfo_x86 and adjust all references to cpu_llc_shared_map.
> >
> > Signed-off-by: Kevin Winchester <[email protected]>
> > ---
> > arch/x86/include/asm/processor.h      |    1 +
> > arch/x86/include/asm/smp.h            |    6 ------
> > arch/x86/kernel/cpu/intel_cacheinfo.c |    4 ++--
> > arch/x86/kernel/cpu/mcheck/mce_amd.c  |    7 ++++---
> > arch/x86/kernel/smpboot.c             |   15 ++++++---------
> > arch/x86/xen/smp.c                    |    1 -
> > 6 files changed, 13 insertions(+), 21 deletions(-)
> >
> >
> > static void impress_friends(void)
> > @@ -1053,7 +1051,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
> >        for_each_possible_cpu(i) {
> >                zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
> >                zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
> > -               zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
> >        }
> >        set_cpu_sibling_map(0);
> >
> > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> > index 041d4fe..a898ed5 100644
> > --- a/arch/x86/xen/smp.c
> > +++ b/arch/x86/xen/smp.c
> > @@ -225,7 +225,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
> >        for_each_possible_cpu(i) {
> >                zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
> >                zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
> > -               zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
> >        }
> >        set_cpu_sibling_map(0);
> >
>
> I just realized that I took out a couple of allocations here
> for cpu_llc_shared_map, without replacing them. Am I leaving
> cpuinfo_x86.llc_shared_map unallocated then, and just writing
> to whatever address that field happened to get?

That will probably crash CONFIG_CPUMASK_OFFSTACK=y kernels.

The simplest approach would be to use a cpumask_t (i.e. not a
variable cpumask_var_t one), on the [valid looking] assumption
that cpuinfo_x86 gets allocated in sane ways - i.e. never on the
kernel stack and such.

The before/after vmlinux 'size' result should be inspected, with
NR_CPUS set to 4096 and OFFSTACK activated in the .config, to
see how bad the size effect is - but I think it should not be
too bad.
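
(For context, cpumask_var_t is defined roughly as follows in
<linux/cpumask.h> - paraphrased sketch - which is why leaving the field
unallocated only turns into a wild pointer with OFFSTACK=y, while a plain
cpumask_t is always embedded storage sized by NR_CPUS bits:)

#ifdef CONFIG_CPUMASK_OFFSTACK
/* just a pointer: must be set up with [z]alloc_cpumask_var() */
typedef struct cpumask *cpumask_var_t;
#else
/* embedded storage: the alloc helpers are effectively no-ops */
typedef struct cpumask cpumask_var_t[1];
#endif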

Thanks,

Ingo

2012-02-12 11:23:08

by Borislav Petkov

Subject: Re: [PATCH] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

On Sat, Feb 11, 2012 at 10:18:53PM -0400, Kevin Winchester wrote:
> > @@ -225,7 +225,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
> >        for_each_possible_cpu(i) {
> >                zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
> >                zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
> > -               zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
> >        }
> >        set_cpu_sibling_map(0);
> >
>
> I just realized that I took out a couple of allocations here for
> cpu_llc_shared_map, without replacing them. Am I leaving
> cpuinfo_x86.llc_shared_map unallocated then, and just writing to
> whatever address that field happened to get?

AFAICT, yes, you need to alloc the cpumask too, i.e. something like

zalloc_cpumask_var(cpu_info(i).llc_shared_map, GFP_KERNEL);

assuming this happens before the cpumask gets initted in
set_cpu_sibling_map() et al. Just give the whole init path a hard
staring, and try to figure out the proper init sequence.

HTH.

P.S.: I'm currently travelling but will give your patch a run when I get
back, thanks.

--
Regards/Gruss,
Boris.

2012-02-14 00:13:16

by Kevin Winchester

Subject: [PATCH v2] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
'struct cpuinfo_x86'") caused the compilation error:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'

by removing an #ifdef CONFIG_SMP around a block containing a reference
to cpu_llc_shared_map. Rather than replace the #ifdef, move
cpu_llc_shared_map to be a new cpumask_t field llc_shared_map in
struct cpuinfo_x86 and adjust all references to cpu_llc_shared_map.

The size effects on various kernels are as follows:

   text   data     bss     dec    hex filename
5281572 513296 1044480 6839348 685c34 vmlinux.up
5281572 513296 1044480 6839348 685c34 vmlinux.up.patched
5548860 516792 1110016 7175668 6d7df4 vmlinux.smp.2
5548837 516792 1110016 7175645 6d7ddd vmlinux.smp.2.patched
5595965 706840 1310720 7613525 742c55 vmlinux.smp.max
5595876 707880 1310720 7614476 74300c vmlinux.smp.max.patched

It can be seen that this change has no effect on UP, a minor effect for
SMP with a maximum of 2 CPUs, and a more substantial but still not overly
large effect for MAXSMP.

Signed-off-by: Kevin Winchester <[email protected]>
---

I'm still wondering if I should give the same treatment to:

cpu_sibling_map
cpu_core_map
cpu_llc_id
cpu_number

or is that going too far?

arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/smp.h | 6 ------
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/cpu/mcheck/mce_amd.c | 7 ++++---
arch/x86/kernel/smpboot.c | 15 ++++++---------
arch/x86/xen/smp.c | 1 -
6 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index aa9088c..dde36b4 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -110,6 +110,7 @@ struct cpuinfo_x86 {
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
+ cpumask_t llc_shared_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 0434c40..f7599d0 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -34,7 +34,6 @@ static inline bool cpu_has_ht_siblings(void)
DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
/* cpus sharing the last level cache: */
-DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

@@ -48,11 +47,6 @@ static inline struct cpumask *cpu_core_mask(int cpu)
return per_cpu(cpu_core_map, cpu);
}

-static inline struct cpumask *cpu_llc_shared_mask(int cpu)
-{
- return per_cpu(cpu_llc_shared_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 73d08ed..a9cd551 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -734,11 +734,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
ret = 0;
if (index == 3) {
ret = 1;
- for_each_cpu(i, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(i, &c->llc_shared_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(sibling, &c->llc_shared_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 786e76a..5e0ec2c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -525,11 +525,12 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
struct threshold_bank *b = NULL;
struct device *dev = mce_device[cpu];
char name[32];
+ struct cpuinfo_x86 *c = &cpu_data(cpu);

sprintf(name, "threshold_bank%i", bank);

- if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
- i = cpumask_first(cpu_llc_shared_mask(cpu));
+ if (c->cpu_core_id && shared_bank[bank]) { /* symlink */
+ i = cpumask_first(&c->llc_shared_map);

/* first core not up yet */
if (cpu_data(i).cpu_core_id)
@@ -548,7 +549,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
if (err)
goto out;

- cpumask_copy(b->cpus, cpu_llc_shared_mask(cpu));
+ cpumask_copy(b->cpus, &c->llc_shared_map);
per_cpu(threshold_banks, cpu)[bank] = b;

goto out;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 66d250c..4451a3a 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -127,8 +127,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);

-DEFINE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -337,8 +335,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
- cpumask_set_cpu(cpu1, cpu_llc_shared_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_llc_shared_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).llc_shared_map);
}


@@ -367,7 +365,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
}

- cpumask_set_cpu(cpu, cpu_llc_shared_mask(cpu));
+ cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
@@ -378,8 +376,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
for_each_cpu(i, cpu_sibling_setup_mask) {
if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
- cpumask_set_cpu(i, cpu_llc_shared_mask(cpu));
- cpumask_set_cpu(cpu, cpu_llc_shared_mask(i));
+ cpumask_set_cpu(i, &c->llc_shared_map);
+ cpumask_set_cpu(cpu, &cpu_data(i).llc_shared_map);
}
if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
@@ -418,7 +416,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
return cpu_core_mask(cpu);
else
- return cpu_llc_shared_mask(cpu);
+ return &c->llc_shared_map;
}

static void impress_friends(void)
@@ -1053,7 +1051,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 041d4fe..a898ed5 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -225,7 +225,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

--
1.7.9

2012-02-17 11:56:42

by Ingo Molnar

Subject: Re: [PATCH v2] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86


* Kevin Winchester <[email protected]> wrote:

> Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
> 'struct cpuinfo_x86'") caused the compilation error:
>
> mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'
>
> by removing an #ifdef CONFIG_SMP around a block containing a reference
> to cpu_llc_shared_map. Rather than replace the #ifdef, move
> cpu_llc_shared_map to be a new cpumask_t field llc_shared_map in
> struct cpuinfo_x86 and adjust all references to cpu_llc_shared_map.
>
> The size effects on various kernels are as follows:
>
> text data bss dec hex filename
> 5281572 513296 1044480 6839348 685c34 vmlinux.up
> 5281572 513296 1044480 6839348 685c34 vmlinux.up.patched
> 5548860 516792 1110016 7175668 6d7df4 vmlinux.smp.2
> 5548837 516792 1110016 7175645 6d7ddd vmlinux.smp.2.patched
> 5595965 706840 1310720 7613525 742c55 vmlinux.smp.max
> 5595876 707880 1310720 7614476 74300c vmlinux.smp.max.patched
>
> It can be seen that this change has no effect on UP, a minor effect for
> SMP with Max 2 CPUs, and a more substantial but still not overly large
> effect for MAXSMP.
>
> Signed-off-by: Kevin Winchester <[email protected]>
> ---
>
> I'm still wondering if I should I give the same treatment to:
>
> cpu_sibling_map
> cpu_core_map
> cpu_llc_id
> cpu_number
>
> or is that going too far?
>
> arch/x86/include/asm/processor.h | 1 +
> arch/x86/include/asm/smp.h | 6 ------
> arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
> arch/x86/kernel/cpu/mcheck/mce_amd.c | 7 ++++---
> arch/x86/kernel/smpboot.c | 15 ++++++---------
> arch/x86/xen/smp.c | 1 -
> 6 files changed, 13 insertions(+), 21 deletions(-)

Yeah, I'd definitely give them the same treatment.

Would you like to update your series? I'd suggest you keep patch
#1 in place, as it's already probably reasonably well tested.

Thanks,

Ingo

2012-02-17 13:12:05

by Kevin Winchester

Subject: Re: [PATCH v2] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

On 17 February 2012 07:56, Ingo Molnar <[email protected]> wrote:
>
>>
>> I'm still wondering if I should I give the same treatment to:
>>
>> cpu_sibling_map
>> cpu_core_map
>> cpu_llc_id
>> cpu_number
>>
>
> Yeah, I'd definitely give them the same treatment.
>
> Would you like to update your series? I'd suggest you keep patch
> #1 in place, as it's already probably reasonably well tested.
>

Sure, I'll leave that patch as #1, and add an additional patch per
field. It will take me a week or two to find the time to get it all
together, but I'll keep working away at it.

--
Kevin Winchester

2012-02-21 02:06:31

by Kevin Winchester

Subject: [PATCH 2/5] x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 2 --
arch/x86/kernel/apic/apic_numachip.c | 2 +-
arch/x86/kernel/cpu/amd.c | 14 ++++----------
arch/x86/kernel/cpu/common.c | 1 +
arch/x86/kernel/cpu/intel_cacheinfo.c | 11 ++---------
arch/x86/kernel/smpboot.c | 18 ++++++++----------
7 files changed, 18 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 0c8b574..2d4eb6f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -111,6 +111,8 @@ struct cpuinfo_x86 {
u16 cpu_index;
u32 microcode;
cpumask_t llc_shared_map;
+ /* cpus sharing the last level cache: */
+ u16 llc_id;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index f7599d0..40d1c96 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -33,8 +33,6 @@ static inline bool cpu_has_ht_siblings(void)

DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
-/* cpus sharing the last level cache: */
-DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

static inline struct cpumask *cpu_sibling_mask(int cpu)
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index 09d3d8c..73c46cf 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -202,7 +202,7 @@ static void __init map_csrs(void)
static void fixup_cpu_id(struct cpuinfo_x86 *c, int node)
{
c->phys_proc_id = node;
- per_cpu(cpu_llc_id, smp_processor_id()) = node;
+ c->llc_id = node;
}

static int __init numachip_system_init(void)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 0a44b90..1cd9d51 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -268,7 +268,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
{
u32 nodes, cores_per_cu = 1;
u8 node_id;
- int cpu = smp_processor_id();

/* get information required for multi-node processors */
if (cpu_has(c, X86_FEATURE_TOPOEXT)) {
@@ -301,7 +300,7 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
cus_per_node = cores_per_node / cores_per_cu;

/* store NodeID, use llc_shared_map to store sibling info */
- per_cpu(cpu_llc_id, cpu) = node_id;
+ c->llc_id = node_id;

/* core id has to be in the [0 .. cores_per_node - 1] range */
c->cpu_core_id %= cores_per_node;
@@ -318,7 +317,6 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_X86_HT
unsigned bits;
- int cpu = smp_processor_id();

bits = c->x86_coreid_bits;
/* Low order bits define the core id (index of core in socket) */
@@ -326,18 +324,14 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
/* Convert the initial APIC ID into the socket ID */
c->phys_proc_id = c->initial_apicid >> bits;
/* use socket ID also for last level cache */
- per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
+ c->llc_id = c->phys_proc_id;
amd_get_topology(c);
#endif
}

int amd_get_nb_id(int cpu)
{
- int id = 0;
-#ifdef CONFIG_SMP
- id = per_cpu(cpu_llc_id, cpu);
-#endif
- return id;
+ return cpu_data(cpu).llc_id;
}
EXPORT_SYMBOL_GPL(amd_get_nb_id);

@@ -350,7 +344,7 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)

node = numa_cpu_node(cpu);
if (node == NUMA_NO_NODE)
- node = per_cpu(cpu_llc_id, cpu);
+ node = c->llc_id;

/*
* If core numbers are inconsistent, it's likely a multi-fabric platform,
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8b6a3bb..7052410 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -787,6 +787,7 @@ static void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
c->x86_model_id[0] = '\0'; /* Unset */
c->x86_max_cores = 1;
c->x86_coreid_bits = 0;
+ c->llc_id = BAD_APICID;
#ifdef CONFIG_X86_64
c->x86_clflush_size = 64;
c->x86_phys_bits = 36;
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index a9cd551..5ddd6ef 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -579,9 +579,6 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
unsigned int new_l1d = 0, new_l1i = 0; /* Cache sizes from cpuid(4) */
unsigned int new_l2 = 0, new_l3 = 0, i; /* Cache sizes from cpuid(4) */
unsigned int l2_id = 0, l3_id = 0, num_threads_sharing, index_msb;
-#ifdef CONFIG_X86_HT
- unsigned int cpu = c->cpu_index;
-#endif

if (c->cpuid_level > 3) {
static int is_initialized;
@@ -700,16 +697,12 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)

if (new_l2) {
l2 = new_l2;
-#ifdef CONFIG_X86_HT
- per_cpu(cpu_llc_id, cpu) = l2_id;
-#endif
+ c->llc_id = l2_id;
}

if (new_l3) {
l3 = new_l3;
-#ifdef CONFIG_X86_HT
- per_cpu(cpu_llc_id, cpu) = l3_id;
-#endif
+ c->llc_id = l3_id;
}

c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 1c20aa2..fb2ef30 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,9 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* Last level cache ID of each logical CPU */
-DEFINE_PER_CPU(u16, cpu_llc_id) = BAD_APICID;
-
/* representing HT siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
@@ -353,7 +350,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)

if (cpu_has(c, X86_FEATURE_TOPOEXT)) {
if (c->phys_proc_id == o->phys_proc_id &&
- per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i) &&
+ c->llc_id == o->llc_id &&
c->compute_unit_id == o->compute_unit_id)
link_thread_siblings(cpu, i);
} else if (c->phys_proc_id == o->phys_proc_id &&
@@ -374,12 +371,13 @@ void __cpuinit set_cpu_sibling_map(int cpu)
}

for_each_cpu(i, cpu_sibling_setup_mask) {
- if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
- per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
+ struct cpuinfo_x86 *o = &cpu_data(i);
+
+ if (c->llc_id != BAD_APICID && c->llc_id == o->llc_id) {
cpumask_set_cpu(i, &c->llc_shared_map);
- cpumask_set_cpu(cpu, &cpu_data(i).llc_shared_map);
+ cpumask_set_cpu(cpu, &o->llc_shared_map);
}
- if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
+ if (c->phys_proc_id == o->phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
cpumask_set_cpu(cpu, cpu_core_mask(i));
/*
@@ -397,9 +395,9 @@ void __cpuinit set_cpu_sibling_map(int cpu)
* the other cpus in this package
*/
if (i != cpu)
- cpu_data(i).booted_cores++;
+ o->booted_cores++;
} else if (i != cpu && !c->booted_cores)
- c->booted_cores = cpu_data(i).booted_cores;
+ c->booted_cores = o->booted_cores;
}
}
}
--
1.7.9

2012-02-21 02:06:29

by Kevin Winchester

Subject: [PATCH 1/5] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
'struct cpuinfo_x86'") caused the compilation error:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'

by removing an #ifdef CONFIG_SMP around a block containing a reference
to cpu_llc_shared_map. Rather than replace the #ifdef, move
cpu_llc_shared_map to be a new cpumask_t field llc_shared_map in
struct cpuinfo_x86 and adjust all references to cpu_llc_shared_map.

The size effects on various kernels are as follows:

text data bss dec hex filename
5281572 513296 1044480 6839348 685c34 vmlinux.up
5281572 513296 1044480 6839348 685c34 vmlinux.up.patched
5548860 516792 1110016 7175668 6d7df4 vmlinux.smp.2
5548837 516792 1110016 7175645 6d7ddd vmlinux.smp.2.patched
5595965 706840 1310720 7613525 742c55 vmlinux.smp.max
5595876 707880 1310720 7614476 74300c vmlinux.smp.max.patched

It can be seen that this change has no effect on UP, a minor effect for
SMP with Max 2 CPUs, and a more substantial but still not overly large
effect for MAXSMP.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/smp.h | 6 ------
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/cpu/mcheck/mce_amd.c | 7 ++++---
arch/x86/kernel/smpboot.c | 15 ++++++---------
arch/x86/xen/smp.c | 1 -
6 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 4b81258..0c8b574 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -110,6 +110,7 @@ struct cpuinfo_x86 {
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
+ cpumask_t llc_shared_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 0434c40..f7599d0 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -34,7 +34,6 @@ static inline bool cpu_has_ht_siblings(void)
DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
/* cpus sharing the last level cache: */
-DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

@@ -48,11 +47,6 @@ static inline struct cpumask *cpu_core_mask(int cpu)
return per_cpu(cpu_core_map, cpu);
}

-static inline struct cpumask *cpu_llc_shared_mask(int cpu)
-{
- return per_cpu(cpu_llc_shared_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 73d08ed..a9cd551 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -734,11 +734,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
ret = 0;
if (index == 3) {
ret = 1;
- for_each_cpu(i, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(i, &c->llc_shared_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(sibling, &c->llc_shared_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 786e76a..5e0ec2c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -525,11 +525,12 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
struct threshold_bank *b = NULL;
struct device *dev = mce_device[cpu];
char name[32];
+ struct cpuinfo_x86 *c = &cpu_data(cpu);

sprintf(name, "threshold_bank%i", bank);

- if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
- i = cpumask_first(cpu_llc_shared_mask(cpu));
+ if (c->cpu_core_id && shared_bank[bank]) { /* symlink */
+ i = cpumask_first(&c->llc_shared_map);

/* first core not up yet */
if (cpu_data(i).cpu_core_id)
@@ -548,7 +549,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
if (err)
goto out;

- cpumask_copy(b->cpus, cpu_llc_shared_mask(cpu));
+ cpumask_copy(b->cpus, &c->llc_shared_map);
per_cpu(threshold_banks, cpu)[bank] = b;

goto out;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 257049d..1c20aa2 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -127,8 +127,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);

-DEFINE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -337,8 +335,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
- cpumask_set_cpu(cpu1, cpu_llc_shared_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_llc_shared_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).llc_shared_map);
}


@@ -367,7 +365,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
}

- cpumask_set_cpu(cpu, cpu_llc_shared_mask(cpu));
+ cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
@@ -378,8 +376,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
for_each_cpu(i, cpu_sibling_setup_mask) {
if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
- cpumask_set_cpu(i, cpu_llc_shared_mask(cpu));
- cpumask_set_cpu(cpu, cpu_llc_shared_mask(i));
+ cpumask_set_cpu(i, &c->llc_shared_map);
+ cpumask_set_cpu(cpu, &cpu_data(i).llc_shared_map);
}
if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
@@ -418,7 +416,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
return cpu_core_mask(cpu);
else
- return cpu_llc_shared_mask(cpu);
+ return &c->llc_shared_map;
}

static void impress_friends(void)
@@ -1054,7 +1052,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 501d4e0..b9f7a86 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -225,7 +225,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

--
1.7.9

2012-02-21 02:06:47

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH 5/5] x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings into common.c

smp_num_siblings was defined in arch/x86/kernel/smpboot.c, making it
necessary to wrap any UP relevant code referencing it with #ifdef
CONFIG_SMP.

Instead, move the definition to arch/x86/kernel/cpu/common.c, thus
making it available always.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/perf_event_p4.h | 9 +++------
arch/x86/include/asm/smp.h | 6 +-----
arch/x86/include/asm/topology.h | 4 +---
arch/x86/kernel/cpu/amd.c | 4 ----
arch/x86/kernel/cpu/common.c | 6 ++++--
arch/x86/kernel/cpu/proc.c | 5 ++---
arch/x86/kernel/cpu/topology.c | 2 --
arch/x86/kernel/process.c | 3 +--
arch/x86/kernel/smpboot.c | 4 ----
arch/x86/oprofile/nmi_int.c | 6 ------
arch/x86/oprofile/op_model_p4.c | 6 ------
11 files changed, 12 insertions(+), 43 deletions(-)

diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 29a65c2..0d50f33 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -8,6 +8,8 @@
#include <linux/cpu.h>
#include <linux/bitops.h>

+#include <asm/smp.h>
+
/*
* NetBurst has performance MSRs shared between
* threads if HT is turned on, ie for both logical
@@ -179,18 +181,13 @@ static inline u64 p4_clear_ht_bit(u64 config)

static inline int p4_ht_active(void)
{
-#ifdef CONFIG_SMP
return smp_num_siblings > 1;
-#endif
- return 0;
}

static inline int p4_ht_thread(int cpu)
{
-#ifdef CONFIG_SMP
if (smp_num_siblings == 2)
- return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
-#endif
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
return 0;
}

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 75aea4d..787127e 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -24,11 +24,7 @@ extern unsigned int num_processors;

static inline bool cpu_has_ht_siblings(void)
{
- bool has_siblings = false;
-#ifdef CONFIG_SMP
- has_siblings = cpu_has_ht && smp_num_siblings > 1;
-#endif
- return has_siblings;
+ return cpu_has_ht && smp_num_siblings > 1;
}

DECLARE_PER_CPU(int, cpu_number);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 58438a1b..7250ad1 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -174,11 +174,9 @@ static inline void arch_fix_phys_package_id(int num, u32 slot)
struct pci_bus;
void x86_pci_root_bus_resources(int bus, struct list_head *resources);

-#ifdef CONFIG_SMP
#define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
(cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
-#define smt_capable() (smp_num_siblings > 1)
-#endif
+#define smt_capable() (smp_num_siblings > 1)

#ifdef CONFIG_NUMA
extern int get_mp_bus_to_node(int busnum);
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 1cd9d51..a8b46df 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -263,7 +263,6 @@ static int __cpuinit nearby_node(int apicid)
* Assumption: Number of cores in each internal node is the same.
* (2) AMD processors supporting compute units
*/
-#ifdef CONFIG_X86_HT
static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
{
u32 nodes, cores_per_cu = 1;
@@ -307,7 +306,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
c->compute_unit_id %= cus_per_node;
}
}
-#endif

/*
* On a AMD dual core setup the lower bits of the APIC id distingush the cores.
@@ -315,7 +313,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
*/
static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_X86_HT
unsigned bits;

bits = c->x86_coreid_bits;
@@ -326,7 +323,6 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
/* use socket ID also for last level cache */
c->llc_id = c->phys_proc_id;
amd_get_topology(c);
-#endif
}

int amd_get_nb_id(int cpu)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 7052410..b2c6e3e 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -47,6 +47,10 @@ cpumask_var_t cpu_initialized_mask;
cpumask_var_t cpu_callout_mask;
cpumask_var_t cpu_callin_mask;

+/* Number of siblings per CPU package */
+int smp_num_siblings = 1;
+EXPORT_SYMBOL(smp_num_siblings);
+
/* representing cpus for which sibling maps can be computed */
cpumask_var_t cpu_sibling_setup_mask;

@@ -452,7 +456,6 @@ void __cpuinit cpu_detect_cache_sizes(struct cpuinfo_x86 *c)

void __cpuinit detect_ht(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_X86_HT
u32 eax, ebx, ecx, edx;
int index_msb, core_bits;
static bool printed;
@@ -498,7 +501,6 @@ out:
c->cpu_core_id);
printed = 1;
}
-#endif
}

static void __cpuinit get_cpu_vendor(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index e6e07c2..aef8b27 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -1,16 +1,16 @@
-#include <linux/smp.h>
#include <linux/timex.h>
#include <linux/string.h>
#include <linux/seq_file.h>
#include <linux/cpufreq.h>

+#include <asm/smp.h>
+
/*
* Get CPU information for use by the procfs.
*/
static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
unsigned int cpu)
{
-#ifdef CONFIG_SMP
if (c->x86_max_cores * smp_num_siblings > 1) {
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
seq_printf(m, "siblings\t: %d\n", cpumask_weight(&c->core_map));
@@ -19,7 +19,6 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
seq_printf(m, "apicid\t\t: %d\n", c->apicid);
seq_printf(m, "initial apicid\t: %d\n", c->initial_apicid);
}
-#endif
}

#ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 4397e98..d4ee471 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -28,7 +28,6 @@
*/
void __cpuinit detect_extended_topology(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_SMP
unsigned int eax, ebx, ecx, edx, sub_index;
unsigned int ht_mask_width, core_plus_mask_width;
unsigned int core_select_mask, core_level_siblings;
@@ -95,5 +94,4 @@ void __cpuinit detect_extended_topology(struct cpuinfo_x86 *c)
printed = 1;
}
return;
-#endif
}
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 44eefde..d531cd2 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -586,12 +586,11 @@ static void amd_e400_idle(void)

void __cpuinit select_idle_routine(const struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_SMP
if (pm_idle == poll_idle && smp_num_siblings > 1) {
printk_once(KERN_WARNING "WARNING: polling idle and HT enabled,"
" performance may degrade.\n");
}
-#endif
+
if (pm_idle)
return;

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index de23378..a0a6b01 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -112,10 +112,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
#define set_idle_for_cpu(x, p) (idle_thread_array[(x)] = (p))
#endif

-/* Number of siblings per CPU package */
-int smp_num_siblings = 1;
-EXPORT_SYMBOL(smp_num_siblings);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 26b8a85..346e7ac 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -572,11 +572,6 @@ static int __init p4_init(char **cpu_type)
if (cpu_model > 6 || cpu_model == 5)
return 0;

-#ifndef CONFIG_SMP
- *cpu_type = "i386/p4";
- model = &op_p4_spec;
- return 1;
-#else
switch (smp_num_siblings) {
case 1:
*cpu_type = "i386/p4";
@@ -588,7 +583,6 @@ static int __init p4_init(char **cpu_type)
model = &op_p4_ht2_spec;
return 1;
}
-#endif

printk(KERN_INFO "oprofile: P4 HyperThreading detected with > 2 threads\n");
printk(KERN_INFO "oprofile: Reverting to timer mode.\n");
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index ae3503e..c6bcb22 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -42,21 +42,15 @@ static unsigned int num_controls = NUM_CONTROLS_NON_HT;
kernel boot-time. */
static inline void setup_num_counters(void)
{
-#ifdef CONFIG_SMP
if (smp_num_siblings == 2) {
num_counters = NUM_COUNTERS_HT2;
num_controls = NUM_CONTROLS_HT2;
}
-#endif
}

static inline int addr_increment(void)
{
-#ifdef CONFIG_SMP
return smp_num_siblings == 2 ? 2 : 1;
-#else
- return 1;
-#endif
}


--
1.7.9

2012-02-21 02:07:12

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH 4/5] x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 6 ------
arch/x86/include/asm/topology.h | 4 ++--
arch/x86/kernel/cpu/proc.c | 3 +--
arch/x86/kernel/smpboot.c | 35 ++++++++++++++---------------------
arch/x86/xen/smp.c | 4 ----
drivers/cpufreq/acpi-cpufreq.c | 2 +-
drivers/cpufreq/powernow-k8.c | 13 +++----------
8 files changed, 23 insertions(+), 46 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 38de3aa..53f7a8c 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -115,6 +115,8 @@ struct cpuinfo_x86 {
u16 llc_id;
/* representing HT siblings of each logical CPU */
cpumask_t sibling_map;
+ /* representing HT and core siblings of each logical CPU */
+ cpumask_t core_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index b5e7cd2..75aea4d 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -31,14 +31,8 @@ static inline bool cpu_has_ht_siblings(void)
return has_siblings;
}

-DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU(int, cpu_number);

-static inline struct cpumask *cpu_core_mask(int cpu)
-{
- return per_cpu(cpu_core_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 5297acbf..58438a1b 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -160,7 +160,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#ifdef ENABLE_TOPO_DEFINES
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
-#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
+#define topology_core_cpumask(cpu) (&cpu_data(cpu).core_map)
#define topology_thread_cpumask(cpu) (&cpu_data(cpu).sibling_map)

/* indicates that pointers to the topology cpumask_t maps are valid */
@@ -176,7 +176,7 @@ void x86_pci_root_bus_resources(int bus, struct list_head *resources);

#ifdef CONFIG_SMP
#define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
- (cpumask_weight(cpu_core_mask(0)) != nr_cpu_ids))
+ (cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
#define smt_capable() (smp_num_siblings > 1)
#endif

diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 8022c66..e6e07c2 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -13,8 +13,7 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
#ifdef CONFIG_SMP
if (c->x86_max_cores * smp_num_siblings > 1) {
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
- seq_printf(m, "siblings\t: %d\n",
- cpumask_weight(cpu_core_mask(cpu)));
+ seq_printf(m, "siblings\t: %d\n", cpumask_weight(&c->core_map));
seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
seq_printf(m, "apicid\t\t: %d\n", c->apicid);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 991d17b..de23378 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,10 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* representing HT and core siblings of each logical CPU */
-DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
-EXPORT_PER_CPU_SYMBOL(cpu_core_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -326,8 +322,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
{
cpumask_set_cpu(cpu1, &cpu_data(cpu2).sibling_map);
cpumask_set_cpu(cpu2, &cpu_data(cpu1).sibling_map);
- cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).core_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).core_map);
cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
cpumask_set_cpu(cpu2, &cpu_data(cpu1).llc_shared_map);
}
@@ -361,7 +357,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
- cpumask_copy(cpu_core_mask(cpu), &c->sibling_map);
+ cpumask_copy(&c->core_map, &c->sibling_map);
c->booted_cores = 1;
return;
}
@@ -374,8 +370,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, &o->llc_shared_map);
}
if (c->phys_proc_id == o->phys_proc_id) {
- cpumask_set_cpu(i, cpu_core_mask(cpu));
- cpumask_set_cpu(cpu, cpu_core_mask(i));
+ cpumask_set_cpu(i, &c->core_map);
+ cpumask_set_cpu(cpu, &o->core_map);
/*
* Does this new cpu bringup a new core?
*/
@@ -404,11 +400,11 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
struct cpuinfo_x86 *c = &cpu_data(cpu);
/*
* For perf, we return last level cache shared map.
- * And for power savings, we return cpu_core_map
+ * And for power savings, we return core map.
*/
if ((sched_mc_power_savings || sched_smt_power_savings) &&
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
- return cpu_core_mask(cpu);
+ return &c->core_map;
else
return &c->llc_shared_map;
}
@@ -907,7 +903,7 @@ static __init void disable_smp(void)
else
physid_set_mask_of_physid(0, &phys_cpu_present_map);
cpumask_set_cpu(0, &cpu_data(0).sibling_map);
- cpumask_set_cpu(0, cpu_core_mask(0));
+ cpumask_set_cpu(0, &cpu_data(0).core_map);
}

/*
@@ -1030,8 +1026,6 @@ static void __init smp_cpu_index_default(void)
*/
void __init native_smp_prepare_cpus(unsigned int max_cpus)
{
- unsigned int i;
-
preempt_disable();
smp_cpu_index_default();

@@ -1043,9 +1037,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
mb();

current_thread_info()->cpu = 0; /* needed? */
- for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- }
set_cpu_sibling_map(0);


@@ -1233,19 +1224,21 @@ static void remove_siblinginfo(int cpu)
int sibling;
struct cpuinfo_x86 *c = &cpu_data(cpu);

- for_each_cpu(sibling, cpu_core_mask(cpu)) {
- cpumask_clear_cpu(cpu, cpu_core_mask(sibling));
+ for_each_cpu(sibling, &c->core_map) {
+ struct cpuinfo_x86 *o = &cpu_data(sibling);
+
+ cpumask_clear_cpu(cpu, &o->core_map);
/*/
* last thread sibling in this cpu core going down
*/
if (cpumask_weight(&c->sibling_map) == 1)
- cpu_data(sibling).booted_cores--;
+ o->booted_cores--;
}

for_each_cpu(sibling, &c->sibling_map)
cpumask_clear_cpu(cpu, &c->sibling_map);
cpumask_clear(&c->sibling_map);
- cpumask_clear(cpu_core_mask(cpu));
+ cpumask_clear(&c->core_map);
c->phys_proc_id = 0;
c->cpu_core_id = 0;
cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 00f32c0..d1792ec 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -206,7 +206,6 @@ static void __init xen_smp_prepare_boot_cpu(void)
static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
{
unsigned cpu;
- unsigned int i;

if (skip_ioapic_setup) {
char *m = (max_cpus == 0) ?
@@ -222,9 +221,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
smp_store_cpu_info(0);
cpu_data(0).x86_max_cores = 1;

- for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- }
set_cpu_sibling_map(0);

if (xen_smp_intr_init(0))
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 56c6c6b..152af7f 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -557,7 +557,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
dmi_check_system(sw_any_bug_dmi_table);
if (bios_with_sw_any_bug && cpumask_weight(policy->cpus) == 1) {
policy->shared_type = CPUFREQ_SHARED_TYPE_ALL;
- cpumask_copy(policy->cpus, cpu_core_mask(cpu));
+ cpumask_copy(policy->cpus, &c->core_map);
}
#endif

diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
index 8f9b2ce..da0767c 100644
--- a/drivers/cpufreq/powernow-k8.c
+++ b/drivers/cpufreq/powernow-k8.c
@@ -66,13 +66,6 @@ static struct msr __percpu *msrs;

static struct cpufreq_driver cpufreq_amd64_driver;

-#ifndef CONFIG_SMP
-static inline const struct cpumask *cpu_core_mask(int cpu)
-{
- return cpumask_of(0);
-}
-#endif
-
/* Return a frequency in MHz, given an input fid */
static u32 find_freq_from_fid(u32 fid)
{
@@ -715,7 +708,7 @@ static int fill_powernow_table(struct powernow_k8_data *data,

pr_debug("cfid 0x%x, cvid 0x%x\n", data->currfid, data->currvid);
data->powernow_table = powernow_table;
- if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu)
+ if (cpumask_first(&cpu_data(data->cpu).core_map) == data->cpu)
print_basics(data);

for (j = 0; j < data->numps; j++)
@@ -884,7 +877,7 @@ static int powernow_k8_cpu_init_acpi(struct powernow_k8_data *data)
powernow_table[data->acpi_data.state_count].index = 0;
data->powernow_table = powernow_table;

- if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu)
+ if (cpumask_first(&cpu_data(data->cpu).core_map) == data->cpu)
print_basics(data);

/* notify BIOS that we exist */
@@ -1326,7 +1319,7 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
if (cpu_family == CPU_HW_PSTATE)
cpumask_copy(pol->cpus, cpumask_of(pol->cpu));
else
- cpumask_copy(pol->cpus, cpu_core_mask(pol->cpu));
+ cpumask_copy(pol->cpus, &c->core_map);
data->available_cores = pol->cpus;

if (cpu_family == CPU_HW_PSTATE)
--
1.7.9

2012-02-21 02:06:27

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH 0/5] x86: Cleanup and simplify cpu-specific data

Various per-cpu fields are defined in arch/x86/kernel/smpboot.c that are
basically equivalent to the cpu-specific data in struct cpuinfo_x86.
By moving these fields into the structure, a number of codepaths can be
simplified since they no longer need to care about those fields not
existing on !SMP builds.

The size effects on allno (UP) and allyes (MAX_SMP) kernels are as
follows:

text data bss dec hex filename
1586721 304864 506208 2397793 249661 vmlinux.allno
1588517 304928 505920 2399365 249c85 vmlinux.allno.after
84706053 13212311 42434560 140352924 85d9d9c vmlinux.allyes
84705333 13213799 42434560 140353692 85da09c vmlinux.allyes.after

As can be seen, the kernels get slightly larger, but the code reduction/
simplification should be enough to compensate for it.
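
For example (taken from the p4-clockmod change in patch 3), a call site
that today has to hide the sibling map from UP builds:

#ifdef CONFIG_SMP
	cpumask_copy(policy->cpus, cpu_sibling_mask(policy->cpu));
#endif

becomes an unconditional

	cpumask_copy(policy->cpus, &c->sibling_map);

with c being the struct cpuinfo_x86 pointer for policy->cpu.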

Kevin Winchester (5):
x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86
x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86
x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86
x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86
x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings
into common.c

arch/x86/include/asm/perf_event_p4.h | 9 +--
arch/x86/include/asm/processor.h | 7 +++
arch/x86/include/asm/smp.h | 26 +---------
arch/x86/include/asm/topology.h | 10 ++--
arch/x86/kernel/apic/apic_numachip.c | 2 +-
arch/x86/kernel/cpu/amd.c | 18 ++-----
arch/x86/kernel/cpu/common.c | 7 ++-
arch/x86/kernel/cpu/intel_cacheinfo.c | 19 ++-----
arch/x86/kernel/cpu/mcheck/mce_amd.c | 7 ++-
arch/x86/kernel/cpu/proc.c | 8 +--
arch/x86/kernel/cpu/topology.c | 2 -
arch/x86/kernel/process.c | 3 +-
arch/x86/kernel/smpboot.c | 95 +++++++++++++--------------------
arch/x86/oprofile/nmi_int.c | 6 --
arch/x86/oprofile/op_model_p4.c | 11 +----
arch/x86/xen/smp.c | 6 --
drivers/cpufreq/acpi-cpufreq.c | 2 +-
drivers/cpufreq/p4-clockmod.c | 4 +-
drivers/cpufreq/powernow-k8.c | 13 +----
drivers/cpufreq/speedstep-ich.c | 6 +-
drivers/hwmon/coretemp.c | 6 +--
21 files changed, 86 insertions(+), 181 deletions(-)

--
1.7.9

2012-02-21 02:07:46

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH 3/5] x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/perf_event_p4.h | 2 +-
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 6 ------
arch/x86/include/asm/topology.h | 2 +-
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/smpboot.c | 27 +++++++++++----------------
arch/x86/oprofile/op_model_p4.c | 5 +----
arch/x86/xen/smp.c | 1 -
drivers/cpufreq/p4-clockmod.c | 4 +---
drivers/cpufreq/speedstep-ich.c | 6 +++---
drivers/hwmon/coretemp.c | 6 +-----
11 files changed, 23 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 4f7e67e..29a65c2 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -189,7 +189,7 @@ static inline int p4_ht_thread(int cpu)
{
#ifdef CONFIG_SMP
if (smp_num_siblings == 2)
- return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
#endif
return 0;
}
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2d4eb6f..38de3aa 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -113,6 +113,8 @@ struct cpuinfo_x86 {
cpumask_t llc_shared_map;
/* cpus sharing the last level cache: */
u16 llc_id;
+ /* representing HT siblings of each logical CPU */
+ cpumask_t sibling_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 40d1c96..b5e7cd2 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -31,15 +31,9 @@ static inline bool cpu_has_ht_siblings(void)
return has_siblings;
}

-DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU(int, cpu_number);

-static inline struct cpumask *cpu_sibling_mask(int cpu)
-{
- return per_cpu(cpu_sibling_map, cpu);
-}
-
static inline struct cpumask *cpu_core_mask(int cpu)
{
return per_cpu(cpu_core_map, cpu);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index b9676ae..5297acbf 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -161,7 +161,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
-#define topology_thread_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
+#define topology_thread_cpumask(cpu) (&cpu_data(cpu).sibling_map)

/* indicates that pointers to the topology cpumask_t maps are valid */
#define arch_provides_topology_pointers yes
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 5ddd6ef..7787d33 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -739,11 +739,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
}
} else if ((c->x86 == 0x15) && ((index == 1) || (index == 2))) {
ret = 1;
- for_each_cpu(i, cpu_sibling_mask(cpu)) {
+ for_each_cpu(i, &c->sibling_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_sibling_mask(cpu)) {
+ for_each_cpu(sibling, &c->sibling_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index fb2ef30..991d17b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,10 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* representing HT siblings of each logical CPU */
-DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
-EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
-
/* representing HT and core siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);
@@ -328,8 +324,8 @@ void __cpuinit smp_store_cpu_info(int id)

static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
{
- cpumask_set_cpu(cpu1, cpu_sibling_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).sibling_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).sibling_map);
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
@@ -359,13 +355,13 @@ void __cpuinit set_cpu_sibling_map(int cpu)
}
}
} else {
- cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
+ cpumask_set_cpu(cpu, &c->sibling_map);
}

cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
- cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
+ cpumask_copy(cpu_core_mask(cpu), &c->sibling_map);
c->booted_cores = 1;
return;
}
@@ -383,12 +379,12 @@ void __cpuinit set_cpu_sibling_map(int cpu)
/*
* Does this new cpu bringup a new core?
*/
- if (cpumask_weight(cpu_sibling_mask(cpu)) == 1) {
+ if (cpumask_weight(&c->sibling_map) == 1) {
/*
* for each core in package, increment
* the booted_cores for this new cpu
*/
- if (cpumask_first(cpu_sibling_mask(i)) == i)
+ if (cpumask_first(&o->sibling_map) == i)
c->booted_cores++;
/*
* increment the core count for all
@@ -910,7 +906,7 @@ static __init void disable_smp(void)
physid_set_mask_of_physid(boot_cpu_physical_apicid, &phys_cpu_present_map);
else
physid_set_mask_of_physid(0, &phys_cpu_present_map);
- cpumask_set_cpu(0, cpu_sibling_mask(0));
+ cpumask_set_cpu(0, &cpu_data(0).sibling_map);
cpumask_set_cpu(0, cpu_core_mask(0));
}

@@ -1048,7 +1044,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)

current_thread_info()->cpu = 0; /* needed? */
for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);
@@ -1243,13 +1238,13 @@ static void remove_siblinginfo(int cpu)
/*/
* last thread sibling in this cpu core going down
*/
- if (cpumask_weight(cpu_sibling_mask(cpu)) == 1)
+ if (cpumask_weight(&c->sibling_map) == 1)
cpu_data(sibling).booted_cores--;
}

- for_each_cpu(sibling, cpu_sibling_mask(cpu))
- cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
- cpumask_clear(cpu_sibling_mask(cpu));
+ for_each_cpu(sibling, &c->sibling_map)
+ cpumask_clear_cpu(cpu, &c->sibling_map);
+ cpumask_clear(&c->sibling_map);
cpumask_clear(cpu_core_mask(cpu));
c->phys_proc_id = 0;
c->cpu_core_id = 0;
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 98ab130..ae3503e 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -370,11 +370,8 @@ static struct p4_event_binding p4_events[NUM_EVENTS] = {
or "odd" part of all the divided resources. */
static unsigned int get_stagger(void)
{
-#ifdef CONFIG_SMP
int cpu = smp_processor_id();
- return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
-#endif
- return 0;
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
}


diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index b9f7a86..00f32c0 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -223,7 +223,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
cpu_data(0).x86_max_cores = 1;

for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);
diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c
index 6be3e07..a14b9b0 100644
--- a/drivers/cpufreq/p4-clockmod.c
+++ b/drivers/cpufreq/p4-clockmod.c
@@ -203,9 +203,7 @@ static int cpufreq_p4_cpu_init(struct cpufreq_policy *policy)
int cpuid = 0;
unsigned int i;

-#ifdef CONFIG_SMP
- cpumask_copy(policy->cpus, cpu_sibling_mask(policy->cpu));
-#endif
+ cpumask_copy(policy->cpus, &c->sibling_map);

/* Errata workaround */
cpuid = (c->x86 << 8) | (c->x86_model << 4) | c->x86_mask;
diff --git a/drivers/cpufreq/speedstep-ich.c b/drivers/cpufreq/speedstep-ich.c
index a748ce7..630926a 100644
--- a/drivers/cpufreq/speedstep-ich.c
+++ b/drivers/cpufreq/speedstep-ich.c
@@ -326,14 +326,14 @@ static void get_freqs_on_cpu(void *_get_freqs)

static int speedstep_cpu_init(struct cpufreq_policy *policy)
{
+ struct cpuinfo_x86 *c = &cpu_data(policy->cpu);
int result;
unsigned int policy_cpu, speed;
struct get_freqs gf;

/* only run on CPU to be set, or on its sibling */
-#ifdef CONFIG_SMP
- cpumask_copy(policy->cpus, cpu_sibling_mask(policy->cpu));
-#endif
+ cpumask_copy(policy->cpus, c->sibling_map);
+
policy_cpu = cpumask_any_and(policy->cpus, cpu_online_mask);

/* detect low and high frequency and transition latency */
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index a6c6ec3..fdf1590 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -61,11 +61,7 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
#define TO_CORE_ID(cpu) cpu_data(cpu).cpu_core_id
#define TO_ATTR_NO(cpu) (TO_CORE_ID(cpu) + BASE_SYSFS_ATTR_NO)

-#ifdef CONFIG_SMP
-#define for_each_sibling(i, cpu) for_each_cpu(i, cpu_sibling_mask(cpu))
-#else
-#define for_each_sibling(i, cpu) for (i = 0; false; )
-#endif
+#define for_each_sibling(i, cpu) for_each_cpu(i, &cpu_data(cpu).sibling_map)

/*
* Per-Core Temperature Data
--
1.7.9

2012-02-21 10:37:20

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 2/5] x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86

On Mon, Feb 20, 2012 at 10:06:03PM -0400, Kevin Winchester wrote:
> This simplifies the various code paths using this field as it
> groups the per-cpu data together.
>
> Signed-off-by: Kevin Winchester <[email protected]>
> ---
> arch/x86/include/asm/processor.h | 2 ++
> arch/x86/include/asm/smp.h | 2 --
> arch/x86/kernel/apic/apic_numachip.c | 2 +-
> arch/x86/kernel/cpu/amd.c | 14 ++++----------
> arch/x86/kernel/cpu/common.c | 1 +
> arch/x86/kernel/cpu/intel_cacheinfo.c | 11 ++---------
> arch/x86/kernel/smpboot.c | 18 ++++++++----------
> 7 files changed, 18 insertions(+), 32 deletions(-)

This actually makes the code even a bit easier on the eyes.

Acked-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

2012-02-21 10:40:17

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 2/5] x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86

On Mon, Feb 20, 2012 at 10:06:03PM -0400, Kevin Winchester wrote:
> This simplifies the various code paths using this field as it
> groups the per-cpu data together.
>
> Signed-off-by: Kevin Winchester <[email protected]>
> ---
> arch/x86/include/asm/processor.h | 2 ++
> arch/x86/include/asm/smp.h | 2 --
> arch/x86/kernel/apic/apic_numachip.c | 2 +-
> arch/x86/kernel/cpu/amd.c | 14 ++++----------
> arch/x86/kernel/cpu/common.c | 1 +
> arch/x86/kernel/cpu/intel_cacheinfo.c | 11 ++---------
> arch/x86/kernel/smpboot.c | 18 ++++++++----------
> 7 files changed, 18 insertions(+), 32 deletions(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 0c8b574..2d4eb6f 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -111,6 +111,8 @@ struct cpuinfo_x86 {
> u16 cpu_index;
> u32 microcode;
> cpumask_t llc_shared_map;
> + /* cpus sharing the last level cache: */

Just a minor nitpick which I forgot: we spell it "CPUs" in comments, as
you do in the next patch, where you're adding a comment about the sibling_map.

Thanks.

--
Regards/Gruss,
Boris.

2012-02-21 11:35:32

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/5] x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86

On Mon, Feb 20, 2012 at 10:06:04PM -0400, Kevin Winchester wrote:
> This simplifies the various code paths using this field as it
> groups the per-cpu data together.
>
> Signed-off-by: Kevin Winchester <[email protected]>

Acked-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

2012-02-21 14:21:29

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 4/5] x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86

On Mon, Feb 20, 2012 at 10:06:05PM -0400, Kevin Winchester wrote:
> This simplifies the various code paths using this field as it
> groups the per-cpu data together.
>
> Signed-off-by: Kevin Winchester <[email protected]>
> ---
> arch/x86/include/asm/processor.h | 2 ++
> arch/x86/include/asm/smp.h | 6 ------
> arch/x86/include/asm/topology.h | 4 ++--
> arch/x86/kernel/cpu/proc.c | 3 +--
> arch/x86/kernel/smpboot.c | 35 ++++++++++++++---------------------
> arch/x86/xen/smp.c | 4 ----
> drivers/cpufreq/acpi-cpufreq.c | 2 +-
> drivers/cpufreq/powernow-k8.c | 13 +++----------
> 8 files changed, 23 insertions(+), 46 deletions(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 38de3aa..53f7a8c 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -115,6 +115,8 @@ struct cpuinfo_x86 {
> u16 llc_id;
> /* representing HT siblings of each logical CPU */
> cpumask_t sibling_map;
> + /* representing HT and core siblings of each logical CPU */

Let's change that to be more clear:

"representing all execution threads on a logical CPU, i.e. per physical
socket"

or if someone has an even better formulation...
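
I.e., applied to the fields from your patch, something like:

	/* representing HT siblings of each logical CPU */
	cpumask_t		sibling_map;
	/* representing all execution threads on a logical CPU, i.e. per physical socket */
	cpumask_t		core_map;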

other than that:

Acked-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

2012-02-21 15:39:34

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 5/5] x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings into common.c

On Mon, Feb 20, 2012 at 10:06:06PM -0400, Kevin Winchester wrote:
> smp_num_siblings was defined in arch/x86/kernel/smpboot.c, making it
> necessary to wrap any UP relevant code referencing it with #ifdef
> CONFIG_SMP.
>
> Instead, move the definition to arch/x86/kernel/cpu/common.c, thus
> making it available always.
>
> Signed-off-by: Kevin Winchester <[email protected]>
> ---
> arch/x86/include/asm/perf_event_p4.h | 9 +++------
> arch/x86/include/asm/smp.h | 6 +-----
> arch/x86/include/asm/topology.h | 4 +---
> arch/x86/kernel/cpu/amd.c | 4 ----
> arch/x86/kernel/cpu/common.c | 6 ++++--
> arch/x86/kernel/cpu/proc.c | 5 ++---
> arch/x86/kernel/cpu/topology.c | 2 --
> arch/x86/kernel/process.c | 3 +--
> arch/x86/kernel/smpboot.c | 4 ----
> arch/x86/oprofile/nmi_int.c | 6 ------
> arch/x86/oprofile/op_model_p4.c | 6 ------
> 11 files changed, 12 insertions(+), 43 deletions(-)
>
> diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
> index 29a65c2..0d50f33 100644
> --- a/arch/x86/include/asm/perf_event_p4.h
> +++ b/arch/x86/include/asm/perf_event_p4.h
> @@ -8,6 +8,8 @@
> #include <linux/cpu.h>
> #include <linux/bitops.h>
>
> +#include <asm/smp.h>
> +
> /*
> * NetBurst has performance MSRs shared between
> * threads if HT is turned on, ie for both logical
> @@ -179,18 +181,13 @@ static inline u64 p4_clear_ht_bit(u64 config)
>
> static inline int p4_ht_active(void)
> {
> -#ifdef CONFIG_SMP
> return smp_num_siblings > 1;
> -#endif
> - return 0;
> }

You could drop this function completely now and use the equal
smt_capable() macro below.
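
IOW, after this patch p4_ht_active() is simply

	static inline int p4_ht_active(void)
	{
		return smp_num_siblings > 1;
	}

which is the same thing smt_capable() expands to, so callers could use the
macro directly and the helper could go away entirely.
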

>
> static inline int p4_ht_thread(int cpu)
> {
> -#ifdef CONFIG_SMP
> if (smp_num_siblings == 2)
> - return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
> -#endif
> + return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
> return 0;
> }
>
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index 75aea4d..787127e 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -24,11 +24,7 @@ extern unsigned int num_processors;
>
> static inline bool cpu_has_ht_siblings(void)
> {
> - bool has_siblings = false;
> -#ifdef CONFIG_SMP
> - has_siblings = cpu_has_ht && smp_num_siblings > 1;
> -#endif
> - return has_siblings;
> + return cpu_has_ht && smp_num_siblings > 1;
> }
>
> DECLARE_PER_CPU(int, cpu_number);
> diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
> index 58438a1b..7250ad1 100644
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -174,11 +174,9 @@ static inline void arch_fix_phys_package_id(int num, u32 slot)
> struct pci_bus;
> void x86_pci_root_bus_resources(int bus, struct list_head *resources);
>
> -#ifdef CONFIG_SMP
> #define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
> (cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
> -#define smt_capable() (smp_num_siblings > 1)
> -#endif
> +#define smt_capable() (smp_num_siblings > 1)
>
> #ifdef CONFIG_NUMA
> extern int get_mp_bus_to_node(int busnum);
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 1cd9d51..a8b46df 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -263,7 +263,6 @@ static int __cpuinit nearby_node(int apicid)
> * Assumption: Number of cores in each internal node is the same.
> * (2) AMD processors supporting compute units
> */
> -#ifdef CONFIG_X86_HT
> static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
> {
> u32 nodes, cores_per_cu = 1;
> @@ -307,7 +306,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
> c->compute_unit_id %= cus_per_node;
> }
> }
> -#endif

I'm assuming all those X86_HT changes have been built and boot-tested
also with CONFIG_X86_HT unset?

Thanks.

--
Regards/Gruss,
Boris.

2012-02-21 15:42:42

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 1/5] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

On Mon, Feb 20, 2012 at 10:06:02PM -0400, Kevin Winchester wrote:
> Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
> 'struct cpuinfo_x86'") caused the compilation error:
>
> mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'
>
> by removing an #ifdef CONFIG_SMP around a block containing a reference
> to cpu_llc_shared_map. Rather than replace the #ifdef, move
> cpu_llc_shared_map to be a new cpumask_t field llc_shared_map in
> struct cpuinfo_x86 and adjust all references to cpu_llc_shared_map.
>
> The size effects on various kernels are as follows:
>
> text data bss dec hex filename
> 5281572 513296 1044480 6839348 685c34 vmlinux.up
> 5281572 513296 1044480 6839348 685c34 vmlinux.up.patched
> 5548860 516792 1110016 7175668 6d7df4 vmlinux.smp.2
> 5548837 516792 1110016 7175645 6d7ddd vmlinux.smp.2.patched
> 5595965 706840 1310720 7613525 742c55 vmlinux.smp.max
> 5595876 707880 1310720 7614476 74300c vmlinux.smp.max.patched
>
> It can be seen that this change has no effect on UP, a minor effect for
> SMP with Max 2 CPUs, and a more substantial but still not overly large
> effect for MAXSMP.
>
> Signed-off-by: Kevin Winchester <[email protected]>
> ---
> arch/x86/include/asm/processor.h | 1 +
> arch/x86/include/asm/smp.h | 6 ------
> arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
> arch/x86/kernel/cpu/mcheck/mce_amd.c | 7 ++++---
> arch/x86/kernel/smpboot.c | 15 ++++++---------
> arch/x86/xen/smp.c | 1 -
> 6 files changed, 13 insertions(+), 21 deletions(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 4b81258..0c8b574 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -110,6 +110,7 @@ struct cpuinfo_x86 {
> /* Index into per_cpu list: */
> u16 cpu_index;
> u32 microcode;
> + cpumask_t llc_shared_map;
> } __attribute__((__aligned__(SMP_CACHE_BYTES)));
>
> #define X86_VENDOR_INTEL 0
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index 0434c40..f7599d0 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -34,7 +34,6 @@ static inline bool cpu_has_ht_siblings(void)
> DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
> DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
> /* cpus sharing the last level cache: */

You'd need to pull up that comment along with the llc_shared_map
definition above in the struct cpuinfo_x86 so that we know what kind of
map it is.
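
I.e., something like this in struct cpuinfo_x86:

	u16			cpu_index;
	u32			microcode;
	/* cpus sharing the last level cache: */
	cpumask_t		llc_shared_map;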

Thanks.

--
Regards/Gruss,
Boris.

2012-02-22 01:44:27

by Kevin Winchester

[permalink] [raw]
Subject: Re: [PATCH 5/5] x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings into common.c

On 21 February 2012 11:39, Borislav Petkov <[email protected]> wrote:
> On Mon, Feb 20, 2012 at 10:06:06PM -0400, Kevin Winchester wrote:
>> smp_num_siblings was defined in arch/x86/kernel/smpboot.c, making it
>> necessary to wrap any UP relevant code referencing it with #ifdef
>> CONFIG_SMP.
>>
>> Instead, move the definition to arch/x86/kernel/cpu/common.c, thus
>> making it available always.
>>
>> Signed-off-by: Kevin Winchester <[email protected]>
>> ---
>>  arch/x86/include/asm/perf_event_p4.h |    9 +++------
>>  arch/x86/include/asm/smp.h           |    6 +-----
>>  arch/x86/include/asm/topology.h      |    4 +---
>>  arch/x86/kernel/cpu/amd.c            |    4 ----
>>  arch/x86/kernel/cpu/common.c         |    6 ++++--
>>  arch/x86/kernel/cpu/proc.c           |    5 ++---
>>  arch/x86/kernel/cpu/topology.c       |    2 --
>>  arch/x86/kernel/process.c            |    3 +--
>>  arch/x86/kernel/smpboot.c            |    4 ----
>>  arch/x86/oprofile/nmi_int.c          |    6 ------
>>  arch/x86/oprofile/op_model_p4.c      |    6 ------
>>  11 files changed, 12 insertions(+), 43 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
>> index 29a65c2..0d50f33 100644
>> --- a/arch/x86/include/asm/perf_event_p4.h
>> +++ b/arch/x86/include/asm/perf_event_p4.h
>> @@ -8,6 +8,8 @@
>>  #include <linux/cpu.h>
>>  #include <linux/bitops.h>
>>
>> +#include <asm/smp.h>
>> +
>>  /*
>>   * NetBurst has performance MSRs shared between
>>   * threads if HT is turned on, ie for both logical
>> @@ -179,18 +181,13 @@ static inline u64 p4_clear_ht_bit(u64 config)
>>
>>  static inline int p4_ht_active(void)
>>  {
>> -#ifdef CONFIG_SMP
>>        return smp_num_siblings > 1;
>> -#endif
>> -     return 0;
>>  }
>
> You could drop this function completely now and use the equal
> smt_capable() macro below.
>
>>
>>  static inline int p4_ht_thread(int cpu)
>>  {
>> -#ifdef CONFIG_SMP
>>        if (smp_num_siblings == 2)
>> -             return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
>> -#endif
>> +             return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
>>        return 0;
>>  }
>>
>> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
>> index 75aea4d..787127e 100644
>> --- a/arch/x86/include/asm/smp.h
>> +++ b/arch/x86/include/asm/smp.h
>> @@ -24,11 +24,7 @@ extern unsigned int num_processors;
>>
>>  static inline bool cpu_has_ht_siblings(void)
>>  {
>> -     bool has_siblings = false;
>> -#ifdef CONFIG_SMP
>> -     has_siblings = cpu_has_ht && smp_num_siblings > 1;
>> -#endif
>> -     return has_siblings;
>> +     return cpu_has_ht && smp_num_siblings > 1;
>>  }
>>
>>  DECLARE_PER_CPU(int, cpu_number);
>> diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
>> index 58438a1b..7250ad1 100644
>> --- a/arch/x86/include/asm/topology.h
>> +++ b/arch/x86/include/asm/topology.h
>> @@ -174,11 +174,9 @@ static inline void arch_fix_phys_package_id(int num, u32 slot)
>>  struct pci_bus;
>>  void x86_pci_root_bus_resources(int bus, struct list_head *resources);
>>
>> -#ifdef CONFIG_SMP
>>  #define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
>>                       (cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
>> -#define smt_capable()                        (smp_num_siblings > 1)
>> -#endif
>> +#define smt_capable()        (smp_num_siblings > 1)
>>
>>  #ifdef CONFIG_NUMA
>>  extern int get_mp_bus_to_node(int busnum);
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index 1cd9d51..a8b46df 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -263,7 +263,6 @@ static int __cpuinit nearby_node(int apicid)
>>   *      Assumption: Number of cores in each internal node is the same.
>>   * (2) AMD processors supporting compute units
>>   */
>> -#ifdef CONFIG_X86_HT
>>  static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
>>  {
>>        u32 nodes, cores_per_cu = 1;
>> @@ -307,7 +306,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
>>                c->compute_unit_id %= cus_per_node;
>>        }
>>  }
>> -#endif
>
> I'm assuming all those X86_HT changes have been built and boot-tested
> also with CONFIG_X86_HT unset?
>

Yes, I am running a !CONFIG_SMP build with these patches applied right now.

New version of the patchset coming in a few minutes with all requested changes.

--
Kevin Winchester

2012-02-22 01:45:55

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v2 0/5] x86: Cleanup and simplify cpu-specific data

Various per-cpu fields are defined in arch/x86/kernel/smpboot.c that are
basically equivalent to the cpu-specific data in struct cpuinfo_x86.
By moving these fields into the structure, a number of codepaths can be
simplified since they no longer need to care about those fields not
existing on !SMP builds.

The size effects on allno (UP) and allyes (MAX_SMP) kernels are as
follows:

text data bss dec hex filename
1586721 304864 506208 2397793 249661 vmlinux.allno
1588517 304928 505920 2399365 249c85 vmlinux.allno.after
84706053 13212311 42434560 140352924 85d9d9c vmlinux.allyes
84705333 13213799 42434560 140353692 85da09c vmlinux.allyes.after

As can be seen, the kernels get slightly larger, but the code reduction/
simplification should be enough to compensate for it.

Kevin Winchester (5):
x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86
x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86
x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86
x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86
x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings
into common.c

arch/x86/include/asm/perf_event_p4.h | 14 +----
arch/x86/include/asm/processor.h | 10 ++++
arch/x86/include/asm/smp.h | 26 +---------
arch/x86/include/asm/topology.h | 10 ++--
arch/x86/kernel/apic/apic_numachip.c | 2 +-
arch/x86/kernel/cpu/amd.c | 18 ++-----
arch/x86/kernel/cpu/common.c | 7 ++-
arch/x86/kernel/cpu/intel_cacheinfo.c | 19 ++-----
arch/x86/kernel/cpu/mcheck/mce_amd.c | 7 ++-
arch/x86/kernel/cpu/perf_event_p4.c | 4 +-
arch/x86/kernel/cpu/proc.c | 8 +--
arch/x86/kernel/cpu/topology.c | 2 -
arch/x86/kernel/process.c | 3 +-
arch/x86/kernel/smpboot.c | 95 +++++++++++++--------------------
arch/x86/oprofile/nmi_int.c | 6 --
arch/x86/oprofile/op_model_p4.c | 11 +----
arch/x86/xen/smp.c | 6 --
drivers/cpufreq/acpi-cpufreq.c | 2 +-
drivers/cpufreq/p4-clockmod.c | 4 +-
drivers/cpufreq/powernow-k8.c | 13 +----
drivers/cpufreq/speedstep-ich.c | 6 +-
drivers/hwmon/coretemp.c | 6 +--
22 files changed, 91 insertions(+), 188 deletions(-)

--
1.7.9

2012-02-22 01:45:59

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v2 1/5] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
'struct cpuinfo_x86'") caused the compilation error:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'

by removing an #ifdef CONFIG_SMP around a block containing a reference
to cpu_llc_shared_map. Rather than replace the #ifdef, move
cpu_llc_shared_map to be a new cpumask_t field llc_shared_map in
struct cpuinfo_x86 and adjust all references to cpu_llc_shared_map.

The size effects on various kernels are as follows:

text data bss dec hex filename
5281572 513296 1044480 6839348 685c34 vmlinux.up
5281572 513296 1044480 6839348 685c34 vmlinux.up.patched
5548860 516792 1110016 7175668 6d7df4 vmlinux.smp.2
5548837 516792 1110016 7175645 6d7ddd vmlinux.smp.2.patched
5595965 706840 1310720 7613525 742c55 vmlinux.smp.max
5595876 707880 1310720 7614476 74300c vmlinux.smp.max.patched

It can be seen that this change has no effect on UP, a minor effect for
SMP with Max 2 CPUs, and a more substantial but still not overly large
effect for MAXSMP.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 7 -------
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/cpu/mcheck/mce_amd.c | 7 ++++---
arch/x86/kernel/smpboot.c | 15 ++++++---------
arch/x86/xen/smp.c | 1 -
6 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 4b81258..9642867 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -110,6 +110,8 @@ struct cpuinfo_x86 {
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
+ /* CPUs sharing the last level cache: */
+ cpumask_t llc_shared_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 0434c40..61ebe324 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -33,8 +33,6 @@ static inline bool cpu_has_ht_siblings(void)

DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
-/* cpus sharing the last level cache: */
-DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

@@ -48,11 +46,6 @@ static inline struct cpumask *cpu_core_mask(int cpu)
return per_cpu(cpu_core_map, cpu);
}

-static inline struct cpumask *cpu_llc_shared_mask(int cpu)
-{
- return per_cpu(cpu_llc_shared_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 73d08ed..a9cd551 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -734,11 +734,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
ret = 0;
if (index == 3) {
ret = 1;
- for_each_cpu(i, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(i, &c->llc_shared_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(sibling, &c->llc_shared_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 786e76a..5e0ec2c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -525,11 +525,12 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
struct threshold_bank *b = NULL;
struct device *dev = mce_device[cpu];
char name[32];
+ struct cpuinfo_x86 *c = &cpu_data(cpu);

sprintf(name, "threshold_bank%i", bank);

- if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
- i = cpumask_first(cpu_llc_shared_mask(cpu));
+ if (c->cpu_core_id && shared_bank[bank]) { /* symlink */
+ i = cpumask_first(&c->llc_shared_map);

/* first core not up yet */
if (cpu_data(i).cpu_core_id)
@@ -548,7 +549,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
if (err)
goto out;

- cpumask_copy(b->cpus, cpu_llc_shared_mask(cpu));
+ cpumask_copy(b->cpus, &c->llc_shared_map);
per_cpu(threshold_banks, cpu)[bank] = b;

goto out;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 257049d..1c20aa2 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -127,8 +127,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);

-DEFINE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -337,8 +335,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
- cpumask_set_cpu(cpu1, cpu_llc_shared_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_llc_shared_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).llc_shared_map);
}


@@ -367,7 +365,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
}

- cpumask_set_cpu(cpu, cpu_llc_shared_mask(cpu));
+ cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
@@ -378,8 +376,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
for_each_cpu(i, cpu_sibling_setup_mask) {
if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
- cpumask_set_cpu(i, cpu_llc_shared_mask(cpu));
- cpumask_set_cpu(cpu, cpu_llc_shared_mask(i));
+ cpumask_set_cpu(i, &c->llc_shared_map);
+ cpumask_set_cpu(cpu, &cpu_data(i).llc_shared_map);
}
if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
@@ -418,7 +416,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
return cpu_core_mask(cpu);
else
- return cpu_llc_shared_mask(cpu);
+ return &c->llc_shared_map;
}

static void impress_friends(void)
@@ -1054,7 +1052,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 501d4e0..b9f7a86 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -225,7 +225,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

--
1.7.9

2012-02-22 01:46:06

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v2 4/5] x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.
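
For example (a sketch of the topology.h hunk below), the core cpumask
accessor becomes a plain struct field reference:

    /* before: backed by a per-cpu cpumask_var_t */
    #define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))

    /* after: backed by a field in struct cpuinfo_x86 */
    #define topology_core_cpumask(cpu) (&cpu_data(cpu).core_map)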

Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 5 +++++
arch/x86/include/asm/smp.h | 6 ------
arch/x86/include/asm/topology.h | 4 ++--
arch/x86/kernel/cpu/proc.c | 3 +--
arch/x86/kernel/smpboot.c | 35 ++++++++++++++---------------------
arch/x86/xen/smp.c | 4 ----
drivers/cpufreq/acpi-cpufreq.c | 2 +-
drivers/cpufreq/powernow-k8.c | 13 +++----------
8 files changed, 26 insertions(+), 46 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 266eaee..8bcb7ac 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -115,6 +115,11 @@ struct cpuinfo_x86 {
u16 llc_id;
/* representing HT siblings of each logical CPU */
cpumask_t sibling_map;
+ /*
+ * representing all execution threads on a logical CPU, i.e. per
+ * physical socket
+ */
+ cpumask_t core_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index b5e7cd2..75aea4d 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -31,14 +31,8 @@ static inline bool cpu_has_ht_siblings(void)
return has_siblings;
}

-DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU(int, cpu_number);

-static inline struct cpumask *cpu_core_mask(int cpu)
-{
- return per_cpu(cpu_core_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 5297acbf..58438a1b 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -160,7 +160,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#ifdef ENABLE_TOPO_DEFINES
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
-#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
+#define topology_core_cpumask(cpu) (&cpu_data(cpu).core_map)
#define topology_thread_cpumask(cpu) (&cpu_data(cpu).sibling_map)

/* indicates that pointers to the topology cpumask_t maps are valid */
@@ -176,7 +176,7 @@ void x86_pci_root_bus_resources(int bus, struct list_head *resources);

#ifdef CONFIG_SMP
#define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
- (cpumask_weight(cpu_core_mask(0)) != nr_cpu_ids))
+ (cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
#define smt_capable() (smp_num_siblings > 1)
#endif

diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 8022c66..e6e07c2 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -13,8 +13,7 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
#ifdef CONFIG_SMP
if (c->x86_max_cores * smp_num_siblings > 1) {
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
- seq_printf(m, "siblings\t: %d\n",
- cpumask_weight(cpu_core_mask(cpu)));
+ seq_printf(m, "siblings\t: %d\n", cpumask_weight(&c->core_map));
seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
seq_printf(m, "apicid\t\t: %d\n", c->apicid);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 991d17b..de23378 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,10 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* representing HT and core siblings of each logical CPU */
-DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
-EXPORT_PER_CPU_SYMBOL(cpu_core_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -326,8 +322,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
{
cpumask_set_cpu(cpu1, &cpu_data(cpu2).sibling_map);
cpumask_set_cpu(cpu2, &cpu_data(cpu1).sibling_map);
- cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).core_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).core_map);
cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
cpumask_set_cpu(cpu2, &cpu_data(cpu1).llc_shared_map);
}
@@ -361,7 +357,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
- cpumask_copy(cpu_core_mask(cpu), &c->sibling_map);
+ cpumask_copy(&c->core_map, &c->sibling_map);
c->booted_cores = 1;
return;
}
@@ -374,8 +370,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, &o->llc_shared_map);
}
if (c->phys_proc_id == o->phys_proc_id) {
- cpumask_set_cpu(i, cpu_core_mask(cpu));
- cpumask_set_cpu(cpu, cpu_core_mask(i));
+ cpumask_set_cpu(i, &c->core_map);
+ cpumask_set_cpu(cpu, &o->core_map);
/*
* Does this new cpu bringup a new core?
*/
@@ -404,11 +400,11 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
struct cpuinfo_x86 *c = &cpu_data(cpu);
/*
* For perf, we return last level cache shared map.
- * And for power savings, we return cpu_core_map
+ * And for power savings, we return core map.
*/
if ((sched_mc_power_savings || sched_smt_power_savings) &&
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
- return cpu_core_mask(cpu);
+ return &c->core_map;
else
return &c->llc_shared_map;
}
@@ -907,7 +903,7 @@ static __init void disable_smp(void)
else
physid_set_mask_of_physid(0, &phys_cpu_present_map);
cpumask_set_cpu(0, &cpu_data(0).sibling_map);
- cpumask_set_cpu(0, cpu_core_mask(0));
+ cpumask_set_cpu(0, &cpu_data(0).core_map);
}

/*
@@ -1030,8 +1026,6 @@ static void __init smp_cpu_index_default(void)
*/
void __init native_smp_prepare_cpus(unsigned int max_cpus)
{
- unsigned int i;
-
preempt_disable();
smp_cpu_index_default();

@@ -1043,9 +1037,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
mb();

current_thread_info()->cpu = 0; /* needed? */
- for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- }
set_cpu_sibling_map(0);


@@ -1233,19 +1224,21 @@ static void remove_siblinginfo(int cpu)
int sibling;
struct cpuinfo_x86 *c = &cpu_data(cpu);

- for_each_cpu(sibling, cpu_core_mask(cpu)) {
- cpumask_clear_cpu(cpu, cpu_core_mask(sibling));
+ for_each_cpu(sibling, &c->core_map) {
+ struct cpuinfo_x86 *o = &cpu_data(sibling);
+
+ cpumask_clear_cpu(cpu, &o->core_map);
/*/
* last thread sibling in this cpu core going down
*/
if (cpumask_weight(&c->sibling_map) == 1)
- cpu_data(sibling).booted_cores--;
+ o->booted_cores--;
}

for_each_cpu(sibling, &c->sibling_map)
cpumask_clear_cpu(cpu, &c->sibling_map);
cpumask_clear(&c->sibling_map);
- cpumask_clear(cpu_core_mask(cpu));
+ cpumask_clear(&c->core_map);
c->phys_proc_id = 0;
c->cpu_core_id = 0;
cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 00f32c0..d1792ec 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -206,7 +206,6 @@ static void __init xen_smp_prepare_boot_cpu(void)
static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
{
unsigned cpu;
- unsigned int i;

if (skip_ioapic_setup) {
char *m = (max_cpus == 0) ?
@@ -222,9 +221,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
smp_store_cpu_info(0);
cpu_data(0).x86_max_cores = 1;

- for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- }
set_cpu_sibling_map(0);

if (xen_smp_intr_init(0))
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 56c6c6b..152af7f 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -557,7 +557,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
dmi_check_system(sw_any_bug_dmi_table);
if (bios_with_sw_any_bug && cpumask_weight(policy->cpus) == 1) {
policy->shared_type = CPUFREQ_SHARED_TYPE_ALL;
- cpumask_copy(policy->cpus, cpu_core_mask(cpu));
+ cpumask_copy(policy->cpus, &c->core_map);
}
#endif

diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
index 8f9b2ce..da0767c 100644
--- a/drivers/cpufreq/powernow-k8.c
+++ b/drivers/cpufreq/powernow-k8.c
@@ -66,13 +66,6 @@ static struct msr __percpu *msrs;

static struct cpufreq_driver cpufreq_amd64_driver;

-#ifndef CONFIG_SMP
-static inline const struct cpumask *cpu_core_mask(int cpu)
-{
- return cpumask_of(0);
-}
-#endif
-
/* Return a frequency in MHz, given an input fid */
static u32 find_freq_from_fid(u32 fid)
{
@@ -715,7 +708,7 @@ static int fill_powernow_table(struct powernow_k8_data *data,

pr_debug("cfid 0x%x, cvid 0x%x\n", data->currfid, data->currvid);
data->powernow_table = powernow_table;
- if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu)
+ if (cpumask_first(&cpu_data(data->cpu).core_map) == data->cpu)
print_basics(data);

for (j = 0; j < data->numps; j++)
@@ -884,7 +877,7 @@ static int powernow_k8_cpu_init_acpi(struct powernow_k8_data *data)
powernow_table[data->acpi_data.state_count].index = 0;
data->powernow_table = powernow_table;

- if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu)
+ if (cpumask_first(&cpu_data(data->cpu).core_map) == data->cpu)
print_basics(data);

/* notify BIOS that we exist */
@@ -1326,7 +1319,7 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
if (cpu_family == CPU_HW_PSTATE)
cpumask_copy(pol->cpus, cpumask_of(pol->cpu));
else
- cpumask_copy(pol->cpus, cpu_core_mask(pol->cpu));
+ cpumask_copy(pol->cpus, &c->core_map);
data->available_cores = pol->cpus;

if (cpu_family == CPU_HW_PSTATE)
--
1.7.9

2012-02-22 01:46:04

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v2 3/5] x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.
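
For example (a sketch of the coretemp.c hunk below), the driver no
longer needs a UP fallback for its sibling iterator:

    /* before: cpu_sibling_mask() is only defined on CONFIG_SMP */
    #ifdef CONFIG_SMP
    #define for_each_sibling(i, cpu) for_each_cpu(i, cpu_sibling_mask(cpu))
    #else
    #define for_each_sibling(i, cpu) for (i = 0; false; )
    #endif

    /* after: sibling_map is always there in cpu_data(cpu) */
    #define for_each_sibling(i, cpu) for_each_cpu(i, &cpu_data(cpu).sibling_map)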

Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/perf_event_p4.h | 2 +-
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 6 ------
arch/x86/include/asm/topology.h | 2 +-
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/smpboot.c | 27 +++++++++++----------------
arch/x86/oprofile/op_model_p4.c | 5 +----
arch/x86/xen/smp.c | 1 -
drivers/cpufreq/p4-clockmod.c | 4 +---
drivers/cpufreq/speedstep-ich.c | 6 +++---
drivers/hwmon/coretemp.c | 6 +-----
11 files changed, 23 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 4f7e67e..29a65c2 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -189,7 +189,7 @@ static inline int p4_ht_thread(int cpu)
{
#ifdef CONFIG_SMP
if (smp_num_siblings == 2)
- return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
#endif
return 0;
}
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2ba6d0e..266eaee 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -113,6 +113,8 @@ struct cpuinfo_x86 {
/* CPUs sharing the last level cache: */
cpumask_t llc_shared_map;
u16 llc_id;
+ /* representing HT siblings of each logical CPU */
+ cpumask_t sibling_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 40d1c96..b5e7cd2 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -31,15 +31,9 @@ static inline bool cpu_has_ht_siblings(void)
return has_siblings;
}

-DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU(int, cpu_number);

-static inline struct cpumask *cpu_sibling_mask(int cpu)
-{
- return per_cpu(cpu_sibling_map, cpu);
-}
-
static inline struct cpumask *cpu_core_mask(int cpu)
{
return per_cpu(cpu_core_map, cpu);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index b9676ae..5297acbf 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -161,7 +161,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
-#define topology_thread_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
+#define topology_thread_cpumask(cpu) (&cpu_data(cpu).sibling_map)

/* indicates that pointers to the topology cpumask_t maps are valid */
#define arch_provides_topology_pointers yes
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 5ddd6ef..7787d33 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -739,11 +739,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
}
} else if ((c->x86 == 0x15) && ((index == 1) || (index == 2))) {
ret = 1;
- for_each_cpu(i, cpu_sibling_mask(cpu)) {
+ for_each_cpu(i, &c->sibling_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_sibling_mask(cpu)) {
+ for_each_cpu(sibling, &c->sibling_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index fb2ef30..991d17b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,10 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* representing HT siblings of each logical CPU */
-DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
-EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
-
/* representing HT and core siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);
@@ -328,8 +324,8 @@ void __cpuinit smp_store_cpu_info(int id)

static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
{
- cpumask_set_cpu(cpu1, cpu_sibling_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).sibling_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).sibling_map);
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
@@ -359,13 +355,13 @@ void __cpuinit set_cpu_sibling_map(int cpu)
}
}
} else {
- cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
+ cpumask_set_cpu(cpu, &c->sibling_map);
}

cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
- cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
+ cpumask_copy(cpu_core_mask(cpu), &c->sibling_map);
c->booted_cores = 1;
return;
}
@@ -383,12 +379,12 @@ void __cpuinit set_cpu_sibling_map(int cpu)
/*
* Does this new cpu bringup a new core?
*/
- if (cpumask_weight(cpu_sibling_mask(cpu)) == 1) {
+ if (cpumask_weight(&c->sibling_map) == 1) {
/*
* for each core in package, increment
* the booted_cores for this new cpu
*/
- if (cpumask_first(cpu_sibling_mask(i)) == i)
+ if (cpumask_first(&o->sibling_map) == i)
c->booted_cores++;
/*
* increment the core count for all
@@ -910,7 +906,7 @@ static __init void disable_smp(void)
physid_set_mask_of_physid(boot_cpu_physical_apicid, &phys_cpu_present_map);
else
physid_set_mask_of_physid(0, &phys_cpu_present_map);
- cpumask_set_cpu(0, cpu_sibling_mask(0));
+ cpumask_set_cpu(0, &cpu_data(0).sibling_map);
cpumask_set_cpu(0, cpu_core_mask(0));
}

@@ -1048,7 +1044,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)

current_thread_info()->cpu = 0; /* needed? */
for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);
@@ -1243,13 +1238,13 @@ static void remove_siblinginfo(int cpu)
/*/
* last thread sibling in this cpu core going down
*/
- if (cpumask_weight(cpu_sibling_mask(cpu)) == 1)
+ if (cpumask_weight(&c->sibling_map) == 1)
cpu_data(sibling).booted_cores--;
}

- for_each_cpu(sibling, cpu_sibling_mask(cpu))
- cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
- cpumask_clear(cpu_sibling_mask(cpu));
+ for_each_cpu(sibling, &c->sibling_map)
+ cpumask_clear_cpu(cpu, &c->sibling_map);
+ cpumask_clear(&c->sibling_map);
cpumask_clear(cpu_core_mask(cpu));
c->phys_proc_id = 0;
c->cpu_core_id = 0;
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 98ab130..ae3503e 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -370,11 +370,8 @@ static struct p4_event_binding p4_events[NUM_EVENTS] = {
or "odd" part of all the divided resources. */
static unsigned int get_stagger(void)
{
-#ifdef CONFIG_SMP
int cpu = smp_processor_id();
- return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
-#endif
- return 0;
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
}


diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index b9f7a86..00f32c0 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -223,7 +223,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
cpu_data(0).x86_max_cores = 1;

for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);
diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c
index 6be3e07..a14b9b0 100644
--- a/drivers/cpufreq/p4-clockmod.c
+++ b/drivers/cpufreq/p4-clockmod.c
@@ -203,9 +203,7 @@ static int cpufreq_p4_cpu_init(struct cpufreq_policy *policy)
int cpuid = 0;
unsigned int i;

-#ifdef CONFIG_SMP
- cpumask_copy(policy->cpus, cpu_sibling_mask(policy->cpu));
-#endif
+ cpumask_copy(policy->cpus, &c->sibling_map);

/* Errata workaround */
cpuid = (c->x86 << 8) | (c->x86_model << 4) | c->x86_mask;
diff --git a/drivers/cpufreq/speedstep-ich.c b/drivers/cpufreq/speedstep-ich.c
index a748ce7..630926a 100644
--- a/drivers/cpufreq/speedstep-ich.c
+++ b/drivers/cpufreq/speedstep-ich.c
@@ -326,14 +326,14 @@ static void get_freqs_on_cpu(void *_get_freqs)

static int speedstep_cpu_init(struct cpufreq_policy *policy)
{
+ struct cpuinfo_x86 *c = &cpu_data(policy->cpu);
int result;
unsigned int policy_cpu, speed;
struct get_freqs gf;

/* only run on CPU to be set, or on its sibling */
-#ifdef CONFIG_SMP
- cpumask_copy(policy->cpus, cpu_sibling_mask(policy->cpu));
-#endif
+ cpumask_copy(policy->cpus, c->sibling_map);
+
policy_cpu = cpumask_any_and(policy->cpus, cpu_online_mask);

/* detect low and high frequency and transition latency */
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index a6c6ec3..fdf1590 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -61,11 +61,7 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
#define TO_CORE_ID(cpu) cpu_data(cpu).cpu_core_id
#define TO_ATTR_NO(cpu) (TO_CORE_ID(cpu) + BASE_SYSFS_ATTR_NO)

-#ifdef CONFIG_SMP
-#define for_each_sibling(i, cpu) for_each_cpu(i, cpu_sibling_mask(cpu))
-#else
-#define for_each_sibling(i, cpu) for (i = 0; false; )
-#endif
+#define for_each_sibling(i, cpu) for_each_cpu(i, &cpu_data(cpu).sibling_map)

/*
* Per-Core Temperature Data
--
1.7.9

2012-02-22 01:46:29

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v2 5/5] x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings into common.c

smp_num_siblings was defined in arch/x86/kernel/smpboot.c, making it
necessary to wrap any UP relevant code referencing it with #ifdef
CONFIG_SMP.

Instead, move the definition to arch/x86/kernel/cpu/common.c, thus
making it available always.
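
For example (a sketch of the smp.h hunk below), cpu_has_ht_siblings()
loses its #ifdef entirely:

    /* before: smp_num_siblings is invisible to UP builds */
    static inline bool cpu_has_ht_siblings(void)
    {
        bool has_siblings = false;
    #ifdef CONFIG_SMP
        has_siblings = cpu_has_ht && smp_num_siblings > 1;
    #endif
        return has_siblings;
    }

    /* after: smp_num_siblings is defined in common.c for all configs */
    static inline bool cpu_has_ht_siblings(void)
    {
        return cpu_has_ht && smp_num_siblings > 1;
    }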

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/perf_event_p4.h | 14 +++-----------
arch/x86/include/asm/smp.h | 6 +-----
arch/x86/include/asm/topology.h | 4 +---
arch/x86/kernel/cpu/amd.c | 4 ----
arch/x86/kernel/cpu/common.c | 6 ++++--
arch/x86/kernel/cpu/perf_event_p4.c | 4 ++--
arch/x86/kernel/cpu/proc.c | 5 ++---
arch/x86/kernel/cpu/topology.c | 2 --
arch/x86/kernel/process.c | 3 +--
arch/x86/kernel/smpboot.c | 4 ----
arch/x86/oprofile/nmi_int.c | 6 ------
arch/x86/oprofile/op_model_p4.c | 6 ------
12 files changed, 14 insertions(+), 50 deletions(-)

diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 29a65c2..cfe41dc 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -8,6 +8,8 @@
#include <linux/cpu.h>
#include <linux/bitops.h>

+#include <asm/smp.h>
+
/*
* NetBurst has performance MSRs shared between
* threads if HT is turned on, ie for both logical
@@ -177,20 +179,10 @@ static inline u64 p4_clear_ht_bit(u64 config)
return config & ~P4_CONFIG_HT;
}

-static inline int p4_ht_active(void)
-{
-#ifdef CONFIG_SMP
- return smp_num_siblings > 1;
-#endif
- return 0;
-}
-
static inline int p4_ht_thread(int cpu)
{
-#ifdef CONFIG_SMP
if (smp_num_siblings == 2)
- return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
-#endif
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
return 0;
}

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 75aea4d..787127e 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -24,11 +24,7 @@ extern unsigned int num_processors;

static inline bool cpu_has_ht_siblings(void)
{
- bool has_siblings = false;
-#ifdef CONFIG_SMP
- has_siblings = cpu_has_ht && smp_num_siblings > 1;
-#endif
- return has_siblings;
+ return cpu_has_ht && smp_num_siblings > 1;
}

DECLARE_PER_CPU(int, cpu_number);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 58438a1b..7250ad1 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -174,11 +174,9 @@ static inline void arch_fix_phys_package_id(int num, u32 slot)
struct pci_bus;
void x86_pci_root_bus_resources(int bus, struct list_head *resources);

-#ifdef CONFIG_SMP
#define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
(cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
-#define smt_capable() (smp_num_siblings > 1)
-#endif
+#define smt_capable() (smp_num_siblings > 1)

#ifdef CONFIG_NUMA
extern int get_mp_bus_to_node(int busnum);
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 1cd9d51..a8b46df 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -263,7 +263,6 @@ static int __cpuinit nearby_node(int apicid)
* Assumption: Number of cores in each internal node is the same.
* (2) AMD processors supporting compute units
*/
-#ifdef CONFIG_X86_HT
static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
{
u32 nodes, cores_per_cu = 1;
@@ -307,7 +306,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
c->compute_unit_id %= cus_per_node;
}
}
-#endif

/*
* On a AMD dual core setup the lower bits of the APIC id distingush the cores.
@@ -315,7 +313,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
*/
static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_X86_HT
unsigned bits;

bits = c->x86_coreid_bits;
@@ -326,7 +323,6 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
/* use socket ID also for last level cache */
c->llc_id = c->phys_proc_id;
amd_get_topology(c);
-#endif
}

int amd_get_nb_id(int cpu)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 7052410..b2c6e3e 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -47,6 +47,10 @@ cpumask_var_t cpu_initialized_mask;
cpumask_var_t cpu_callout_mask;
cpumask_var_t cpu_callin_mask;

+/* Number of siblings per CPU package */
+int smp_num_siblings = 1;
+EXPORT_SYMBOL(smp_num_siblings);
+
/* representing cpus for which sibling maps can be computed */
cpumask_var_t cpu_sibling_setup_mask;

@@ -452,7 +456,6 @@ void __cpuinit cpu_detect_cache_sizes(struct cpuinfo_x86 *c)

void __cpuinit detect_ht(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_X86_HT
u32 eax, ebx, ecx, edx;
int index_msb, core_bits;
static bool printed;
@@ -498,7 +501,6 @@ out:
c->cpu_core_id);
printed = 1;
}
-#endif
}

static void __cpuinit get_cpu_vendor(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/perf_event_p4.c b/arch/x86/kernel/cpu/perf_event_p4.c
index ef484d9..9d1413d 100644
--- a/arch/x86/kernel/cpu/perf_event_p4.c
+++ b/arch/x86/kernel/cpu/perf_event_p4.c
@@ -775,7 +775,7 @@ static int p4_validate_raw_event(struct perf_event *event)
* if an event is shared across the logical threads
* the user needs special permissions to be able to use it
*/
- if (p4_ht_active() && p4_event_bind_map[v].shared) {
+ if (smt_capable() && p4_event_bind_map[v].shared) {
if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
return -EACCES;
}
@@ -816,7 +816,7 @@ static int p4_hw_config(struct perf_event *event)
event->hw.config = p4_config_pack_escr(escr) |
p4_config_pack_cccr(cccr);

- if (p4_ht_active() && p4_ht_thread(cpu))
+ if (smt_capable() && p4_ht_thread(cpu))
event->hw.config = p4_set_ht_bit(event->hw.config);

if (event->attr.type == PERF_TYPE_RAW) {
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index e6e07c2..aef8b27 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -1,16 +1,16 @@
-#include <linux/smp.h>
#include <linux/timex.h>
#include <linux/string.h>
#include <linux/seq_file.h>
#include <linux/cpufreq.h>

+#include <asm/smp.h>
+
/*
* Get CPU information for use by the procfs.
*/
static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
unsigned int cpu)
{
-#ifdef CONFIG_SMP
if (c->x86_max_cores * smp_num_siblings > 1) {
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
seq_printf(m, "siblings\t: %d\n", cpumask_weight(&c->core_map));
@@ -19,7 +19,6 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
seq_printf(m, "apicid\t\t: %d\n", c->apicid);
seq_printf(m, "initial apicid\t: %d\n", c->initial_apicid);
}
-#endif
}

#ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 4397e98..d4ee471 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -28,7 +28,6 @@
*/
void __cpuinit detect_extended_topology(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_SMP
unsigned int eax, ebx, ecx, edx, sub_index;
unsigned int ht_mask_width, core_plus_mask_width;
unsigned int core_select_mask, core_level_siblings;
@@ -95,5 +94,4 @@ void __cpuinit detect_extended_topology(struct cpuinfo_x86 *c)
printed = 1;
}
return;
-#endif
}
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 44eefde..d531cd2 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -586,12 +586,11 @@ static void amd_e400_idle(void)

void __cpuinit select_idle_routine(const struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_SMP
if (pm_idle == poll_idle && smp_num_siblings > 1) {
printk_once(KERN_WARNING "WARNING: polling idle and HT enabled,"
" performance may degrade.\n");
}
-#endif
+
if (pm_idle)
return;

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index de23378..a0a6b01 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -112,10 +112,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
#define set_idle_for_cpu(x, p) (idle_thread_array[(x)] = (p))
#endif

-/* Number of siblings per CPU package */
-int smp_num_siblings = 1;
-EXPORT_SYMBOL(smp_num_siblings);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 26b8a85..346e7ac 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -572,11 +572,6 @@ static int __init p4_init(char **cpu_type)
if (cpu_model > 6 || cpu_model == 5)
return 0;

-#ifndef CONFIG_SMP
- *cpu_type = "i386/p4";
- model = &op_p4_spec;
- return 1;
-#else
switch (smp_num_siblings) {
case 1:
*cpu_type = "i386/p4";
@@ -588,7 +583,6 @@ static int __init p4_init(char **cpu_type)
model = &op_p4_ht2_spec;
return 1;
}
-#endif

printk(KERN_INFO "oprofile: P4 HyperThreading detected with > 2 threads\n");
printk(KERN_INFO "oprofile: Reverting to timer mode.\n");
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index ae3503e..c6bcb22 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -42,21 +42,15 @@ static unsigned int num_controls = NUM_CONTROLS_NON_HT;
kernel boot-time. */
static inline void setup_num_counters(void)
{
-#ifdef CONFIG_SMP
if (smp_num_siblings == 2) {
num_counters = NUM_COUNTERS_HT2;
num_controls = NUM_CONTROLS_HT2;
}
-#endif
}

static inline int addr_increment(void)
{
-#ifdef CONFIG_SMP
return smp_num_siblings == 2 ? 2 : 1;
-#else
- return 1;
-#endif
}


--
1.7.9

2012-02-22 01:47:03

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v2 2/5] x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.
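
For example (a sketch of the amd.c hunk below), amd_get_nb_id() goes
from an #ifdef dance to a plain field access:

    /* before: cpu_llc_id only exists on CONFIG_SMP builds */
    int amd_get_nb_id(int cpu)
    {
        int id = 0;
    #ifdef CONFIG_SMP
        id = per_cpu(cpu_llc_id, cpu);
    #endif
        return id;
    }

    /* after: llc_id always lives in struct cpuinfo_x86 */
    int amd_get_nb_id(int cpu)
    {
        return cpu_data(cpu).llc_id;
    }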

Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/smp.h | 1 -
arch/x86/kernel/apic/apic_numachip.c | 2 +-
arch/x86/kernel/cpu/amd.c | 14 ++++----------
arch/x86/kernel/cpu/common.c | 1 +
arch/x86/kernel/cpu/intel_cacheinfo.c | 11 ++---------
arch/x86/kernel/smpboot.c | 18 ++++++++----------
7 files changed, 17 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 9642867..2ba6d0e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -112,6 +112,7 @@ struct cpuinfo_x86 {
u32 microcode;
/* CPUs sharing the last level cache: */
cpumask_t llc_shared_map;
+ u16 llc_id;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 61ebe324..40d1c96 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -33,7 +33,6 @@ static inline bool cpu_has_ht_siblings(void)

DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
-DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

static inline struct cpumask *cpu_sibling_mask(int cpu)
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index 09d3d8c..73c46cf 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -202,7 +202,7 @@ static void __init map_csrs(void)
static void fixup_cpu_id(struct cpuinfo_x86 *c, int node)
{
c->phys_proc_id = node;
- per_cpu(cpu_llc_id, smp_processor_id()) = node;
+ c->llc_id = node;
}

static int __init numachip_system_init(void)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 0a44b90..1cd9d51 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -268,7 +268,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
{
u32 nodes, cores_per_cu = 1;
u8 node_id;
- int cpu = smp_processor_id();

/* get information required for multi-node processors */
if (cpu_has(c, X86_FEATURE_TOPOEXT)) {
@@ -301,7 +300,7 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
cus_per_node = cores_per_node / cores_per_cu;

/* store NodeID, use llc_shared_map to store sibling info */
- per_cpu(cpu_llc_id, cpu) = node_id;
+ c->llc_id = node_id;

/* core id has to be in the [0 .. cores_per_node - 1] range */
c->cpu_core_id %= cores_per_node;
@@ -318,7 +317,6 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_X86_HT
unsigned bits;
- int cpu = smp_processor_id();

bits = c->x86_coreid_bits;
/* Low order bits define the core id (index of core in socket) */
@@ -326,18 +324,14 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
/* Convert the initial APIC ID into the socket ID */
c->phys_proc_id = c->initial_apicid >> bits;
/* use socket ID also for last level cache */
- per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
+ c->llc_id = c->phys_proc_id;
amd_get_topology(c);
#endif
}

int amd_get_nb_id(int cpu)
{
- int id = 0;
-#ifdef CONFIG_SMP
- id = per_cpu(cpu_llc_id, cpu);
-#endif
- return id;
+ return cpu_data(cpu).llc_id;
}
EXPORT_SYMBOL_GPL(amd_get_nb_id);

@@ -350,7 +344,7 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)

node = numa_cpu_node(cpu);
if (node == NUMA_NO_NODE)
- node = per_cpu(cpu_llc_id, cpu);
+ node = c->llc_id;

/*
* If core numbers are inconsistent, it's likely a multi-fabric platform,
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8b6a3bb..7052410 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -787,6 +787,7 @@ static void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
c->x86_model_id[0] = '\0'; /* Unset */
c->x86_max_cores = 1;
c->x86_coreid_bits = 0;
+ c->llc_id = BAD_APICID;
#ifdef CONFIG_X86_64
c->x86_clflush_size = 64;
c->x86_phys_bits = 36;
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index a9cd551..5ddd6ef 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -579,9 +579,6 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
unsigned int new_l1d = 0, new_l1i = 0; /* Cache sizes from cpuid(4) */
unsigned int new_l2 = 0, new_l3 = 0, i; /* Cache sizes from cpuid(4) */
unsigned int l2_id = 0, l3_id = 0, num_threads_sharing, index_msb;
-#ifdef CONFIG_X86_HT
- unsigned int cpu = c->cpu_index;
-#endif

if (c->cpuid_level > 3) {
static int is_initialized;
@@ -700,16 +697,12 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)

if (new_l2) {
l2 = new_l2;
-#ifdef CONFIG_X86_HT
- per_cpu(cpu_llc_id, cpu) = l2_id;
-#endif
+ c->llc_id = l2_id;
}

if (new_l3) {
l3 = new_l3;
-#ifdef CONFIG_X86_HT
- per_cpu(cpu_llc_id, cpu) = l3_id;
-#endif
+ c->llc_id = l3_id;
}

c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 1c20aa2..fb2ef30 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,9 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* Last level cache ID of each logical CPU */
-DEFINE_PER_CPU(u16, cpu_llc_id) = BAD_APICID;
-
/* representing HT siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
@@ -353,7 +350,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)

if (cpu_has(c, X86_FEATURE_TOPOEXT)) {
if (c->phys_proc_id == o->phys_proc_id &&
- per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i) &&
+ c->llc_id == o->llc_id &&
c->compute_unit_id == o->compute_unit_id)
link_thread_siblings(cpu, i);
} else if (c->phys_proc_id == o->phys_proc_id &&
@@ -374,12 +371,13 @@ void __cpuinit set_cpu_sibling_map(int cpu)
}

for_each_cpu(i, cpu_sibling_setup_mask) {
- if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
- per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
+ struct cpuinfo_x86 *o = &cpu_data(i);
+
+ if (c->llc_id != BAD_APICID && c->llc_id == o->llc_id) {
cpumask_set_cpu(i, &c->llc_shared_map);
- cpumask_set_cpu(cpu, &cpu_data(i).llc_shared_map);
+ cpumask_set_cpu(cpu, &o->llc_shared_map);
}
- if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
+ if (c->phys_proc_id == o->phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
cpumask_set_cpu(cpu, cpu_core_mask(i));
/*
@@ -397,9 +395,9 @@ void __cpuinit set_cpu_sibling_map(int cpu)
* the other cpus in this package
*/
if (i != cpu)
- cpu_data(i).booted_cores++;
+ o->booted_cores++;
} else if (i != cpu && !c->booted_cores)
- c->booted_cores = cpu_data(i).booted_cores;
+ c->booted_cores = o->booted_cores;
}
}
}
--
1.7.9

2012-02-22 06:40:40

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

On 02/21/2012 05:45 PM, Kevin Winchester wrote:
> Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
> 'struct cpuinfo_x86'") caused the compilation error:
>
> mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'
>
> by removing an #ifdef CONFIG_SMP around a block containing a reference
> to cpu_llc_shared_map. Rather than replace the #ifdef, move
> cpu_llc_shared_map to be a new cpumask_t field llc_shared_map in
> struct cpuinfo_x86 and adjust all references to cpu_llc_shared_map.
>

I'm not comfortable with a patch this large after we are already at
-rc4. Please send a minimal patch to fix the failure for v3.3, and then
a patch on top of that which we can queue up with the rest of the
patchset for v3.4.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2012-02-22 09:28:01

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86


* H. Peter Anvin <[email protected]> wrote:

> On 02/21/2012 05:45 PM, Kevin Winchester wrote:
> > Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
> > 'struct cpuinfo_x86'") caused the compilation error:
> >
> > mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'
> >
> > by removing an #ifdef CONFIG_SMP around a block containing a
> > reference to cpu_llc_shared_map. Rather than replace the
> > #ifdef, move cpu_llc_shared_map to be a new cpumask_t field
> > llc_shared_map in struct cpuinfo_x86 and adjust all
> > references to cpu_llc_shared_map.
> >
>
> I'm not comfortable with a patch this large after we are
> already at -rc4. Please send a minimal patch to fix the
> failure for v3.3, and then a patch on top of that which we can
>> queue up with the rest of the patchset for v3.4.

I forgot about the v3.3 aspect of the series - yes, you are
right, doing it like that would be preferred.

It can all be in the same series, we'll sort apart the v3.3 and
v3.4 bits.

Thanks,

Ingo

2012-02-22 12:24:26

by Kevin Winchester

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

On 22 February 2012 05:27, Ingo Molnar <[email protected]> wrote:
>
> * H. Peter Anvin <[email protected]> wrote:
>
>> On 02/21/2012 05:45 PM, Kevin Winchester wrote:
>> > Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
>> > 'struct cpuinfo_x86'") caused the compilation error:
>> >
>> > mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'
>> >
>> > by removing an #ifdef CONFIG_SMP around a block containing a
>> > reference to cpu_llc_shared_map. Rather than replace the
>> > #ifdef, move cpu_llc_shared_map to be a new cpumask_t field
>> > llc_shared_map in struct cpuinfo_x86 and adjust all
>> > references to cpu_llc_shared_map.
>> >
>>
>> I'm not comfortable with a patch this large after we are
>> already at -rc4. Please send a minimal patch to fix the
>> failure for v3.3, and then a patch on top of that which we can
>> queue up with the rest of the patchset for v3.4.
>
> I forgot about the v3.3 aspect of the series - yes, you are
> right, doing it like that would be preferred.
>
> It can all be in the same series, we'll sort apart the v3.3 and
> v3.4 bits.
>

That seems quite reasonable; I was not really thinking about v3.3 or
v3.4 at all when working on this. Can I suggest, then, that you apply
Borislav's original patch:

https://lkml.org/lkml/2012/2/3/331

for now, and then I will rebase on top of that and remove the #ifdefs
as part of my first patch?

--
Kevin Winchester

2012-02-22 16:15:05

by Borislav Petkov

[permalink] [raw]
Subject: [tip:x86/urgent] x86/mce/AMD: Fix UP build error

Commit-ID: 3f806e50981825fa56a7f1938f24c0680816be45
Gitweb: http://git.kernel.org/tip/3f806e50981825fa56a7f1938f24c0680816be45
Author: Borislav Petkov <[email protected]>
AuthorDate: Fri, 3 Feb 2012 20:18:01 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 22 Feb 2012 13:36:30 +0100

x86/mce/AMD: Fix UP build error

141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs
from 'struct cpuinfo_x86'") removed a bunch of CONFIG_SMP ifdefs
around code touching struct cpuinfo_x86 members but also caused
the following build error with Randy's randconfigs:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to `cpu_llc_shared_map'

Restore the #ifdef in threshold_create_bank() which creates
symlinks on the non-BSP CPUs.

There's a better patch series being worked on by Kevin Winchester
which will solve this in a cleaner fashion, but that series is
too ambitious for v3.3 merging - so we first queue up this trivial
fix and then do the rest for v3.4.

Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Kevin Winchester <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Nick Bowler <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 786e76a..e4eeaaf 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -528,6 +528,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)

sprintf(name, "threshold_bank%i", bank);

+#ifdef CONFIG_SMP
if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
i = cpumask_first(cpu_llc_shared_mask(cpu));

@@ -553,6 +554,7 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)

goto out;
}
+#endif

b = kzalloc(sizeof(struct threshold_bank), GFP_KERNEL);
if (!b) {

2012-02-22 23:32:55

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v3 0/5] x86: Cleanup and simplify cpu-specific data

Various per-cpu fields are defined in arch/x86/kernel/smpboot.c that are
basically equivalent to the cpu-specific data in struct cpuinfo_x86.
By moving these fields into the structure, a number of codepaths can be
simplified since they no longer need to care about those fields not
existing on !SMP builds.
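
Roughly speaking (this is only a sketch, see the individual patches
for the exact hunks and comments), struct cpuinfo_x86 ends up gaining:

    struct cpuinfo_x86 {
        ...
        /* CPUs sharing the last level cache: */
        cpumask_t llc_shared_map;
        u16 llc_id;
        /* HT siblings of each logical CPU */
        cpumask_t sibling_map;
        /* all execution threads per physical socket */
        cpumask_t core_map;
    } __attribute__((__aligned__(SMP_CACHE_BYTES)));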

The size effects on allno (UP) and allyes (MAX_SMP) kernels are as
follows:

text data bss dec hex filename
1586721 304864 506208 2397793 249661 vmlinux.allno
1588517 304928 505920 2399365 249c85 vmlinux.allno.after
84706053 13212311 42434560 140352924 85d9d9c vmlinux.allyes
84705333 13213799 42434560 140353692 85da09c vmlinux.allyes.after

As can be seen, the kernels get slightly larger, but the code reduction/
simplification should be enough to compensate for it.

Kevin Winchester (5):
x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86
x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86
x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86
x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86
x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings
into common.c

arch/x86/include/asm/perf_event_p4.h | 14 +----
arch/x86/include/asm/processor.h | 10 ++++
arch/x86/include/asm/smp.h | 26 +---------
arch/x86/include/asm/topology.h | 10 ++--
arch/x86/kernel/apic/apic_numachip.c | 2 +-
arch/x86/kernel/cpu/amd.c | 18 ++-----
arch/x86/kernel/cpu/common.c | 7 ++-
arch/x86/kernel/cpu/intel_cacheinfo.c | 19 ++-----
arch/x86/kernel/cpu/mcheck/mce_amd.c | 9 ++--
arch/x86/kernel/cpu/perf_event_p4.c | 4 +-
arch/x86/kernel/cpu/proc.c | 8 +--
arch/x86/kernel/cpu/topology.c | 2 -
arch/x86/kernel/process.c | 3 +-
arch/x86/kernel/smpboot.c | 95 +++++++++++++--------------------
arch/x86/oprofile/nmi_int.c | 6 --
arch/x86/oprofile/op_model_p4.c | 11 +----
arch/x86/xen/smp.c | 6 --
drivers/cpufreq/acpi-cpufreq.c | 2 +-
drivers/cpufreq/p4-clockmod.c | 4 +-
drivers/cpufreq/powernow-k8.c | 13 +----
drivers/cpufreq/speedstep-ich.c | 6 +-
drivers/hwmon/coretemp.c | 6 +--
22 files changed, 91 insertions(+), 190 deletions(-)

--
1.7.9

2012-02-22 23:33:00

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v3 1/5] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
'struct cpuinfo_x86'") caused the compilation error:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'

by removing an #ifdef CONFIG_SMP around a block containing a reference
to cpu_llc_shared_map. Rather than replace the #ifdef, move
cpu_llc_shared_map to be a new cpumask_t field llc_shared_map in
struct cpuinfo_x86 and adjust all references to cpu_llc_shared_map.

The size effects on various kernels are as follows:

text data bss dec hex filename
5281572 513296 1044480 6839348 685c34 vmlinux.up
5281572 513296 1044480 6839348 685c34 vmlinux.up.patched
5548860 516792 1110016 7175668 6d7df4 vmlinux.smp.2
5548837 516792 1110016 7175645 6d7ddd vmlinux.smp.2.patched
5595965 706840 1310720 7613525 742c55 vmlinux.smp.max
5595876 707880 1310720 7614476 74300c vmlinux.smp.max.patched

It can be seen that this change has no effect on UP, a minor effect for
SMP with Max 2 CPUs, and a more substantial but still not overly large
effect for MAXSMP.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 7 -------
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/cpu/mcheck/mce_amd.c | 9 ++++-----
arch/x86/kernel/smpboot.c | 15 ++++++---------
arch/x86/xen/smp.c | 1 -
6 files changed, 14 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c59ff02..9fe3c5e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -110,6 +110,8 @@ struct cpuinfo_x86 {
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
+ /* CPUs sharing the last level cache: */
+ cpumask_t llc_shared_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 0434c40..61ebe324 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -33,8 +33,6 @@ static inline bool cpu_has_ht_siblings(void)

DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
-/* cpus sharing the last level cache: */
-DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

@@ -48,11 +46,6 @@ static inline struct cpumask *cpu_core_mask(int cpu)
return per_cpu(cpu_core_map, cpu);
}

-static inline struct cpumask *cpu_llc_shared_mask(int cpu)
-{
- return per_cpu(cpu_llc_shared_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 73d08ed..a9cd551 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -734,11 +734,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
ret = 0;
if (index == 3) {
ret = 1;
- for_each_cpu(i, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(i, &c->llc_shared_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(sibling, &c->llc_shared_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index e4eeaaf..5e0ec2c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -525,12 +525,12 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
struct threshold_bank *b = NULL;
struct device *dev = mce_device[cpu];
char name[32];
+ struct cpuinfo_x86 *c = &cpu_data(cpu);

sprintf(name, "threshold_bank%i", bank);

-#ifdef CONFIG_SMP
- if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
- i = cpumask_first(cpu_llc_shared_mask(cpu));
+ if (c->cpu_core_id && shared_bank[bank]) { /* symlink */
+ i = cpumask_first(&c->llc_shared_map);

/* first core not up yet */
if (cpu_data(i).cpu_core_id)
@@ -549,12 +549,11 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
if (err)
goto out;

- cpumask_copy(b->cpus, cpu_llc_shared_mask(cpu));
+ cpumask_copy(b->cpus, &c->llc_shared_map);
per_cpu(threshold_banks, cpu)[bank] = b;

goto out;
}
-#endif

b = kzalloc(sizeof(struct threshold_bank), GFP_KERNEL);
if (!b) {
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f34f8b2..b988c13 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -127,8 +127,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);

-DEFINE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -337,8 +335,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
- cpumask_set_cpu(cpu1, cpu_llc_shared_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_llc_shared_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).llc_shared_map);
}


@@ -367,7 +365,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
}

- cpumask_set_cpu(cpu, cpu_llc_shared_mask(cpu));
+ cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
@@ -378,8 +376,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
for_each_cpu(i, cpu_sibling_setup_mask) {
if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
- cpumask_set_cpu(i, cpu_llc_shared_mask(cpu));
- cpumask_set_cpu(cpu, cpu_llc_shared_mask(i));
+ cpumask_set_cpu(i, &c->llc_shared_map);
+ cpumask_set_cpu(cpu, &cpu_data(i).llc_shared_map);
}
if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
@@ -418,7 +416,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
return cpu_core_mask(cpu);
else
- return cpu_llc_shared_mask(cpu);
+ return &c->llc_shared_map;
}

static void impress_friends(void)
@@ -1052,7 +1050,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 501d4e0..b9f7a86 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -225,7 +225,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

--
1.7.9

2012-02-22 23:33:25

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v3 3/5] x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.
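
For illustration, a minimal sketch of the resulting access pattern; cpu_data()
and the new sibling_map field are taken from the hunks below, while
print_ht_siblings() is just a made-up example helper, not kernel API:

#include <linux/kernel.h>
#include <linux/cpumask.h>
#include <asm/processor.h>

static void print_ht_siblings(int cpu)
{
	int sibling;

	/* previously: for_each_cpu(sibling, cpu_sibling_mask(cpu)) */
	for_each_cpu(sibling, &cpu_data(cpu).sibling_map)
		pr_info("CPU%d is an HT sibling of CPU%d\n", sibling, cpu);
}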

Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/perf_event_p4.h | 2 +-
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 6 ------
arch/x86/include/asm/topology.h | 2 +-
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/smpboot.c | 27 +++++++++++----------------
arch/x86/oprofile/op_model_p4.c | 5 +----
arch/x86/xen/smp.c | 1 -
drivers/cpufreq/p4-clockmod.c | 4 +---
drivers/cpufreq/speedstep-ich.c | 6 +++---
drivers/hwmon/coretemp.c | 6 +-----
11 files changed, 23 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 4f7e67e..29a65c2 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -189,7 +189,7 @@ static inline int p4_ht_thread(int cpu)
{
#ifdef CONFIG_SMP
if (smp_num_siblings == 2)
- return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
#endif
return 0;
}
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2d304f9..a3fce4e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -113,6 +113,8 @@ struct cpuinfo_x86 {
/* CPUs sharing the last level cache: */
cpumask_t llc_shared_map;
u16 llc_id;
+ /* representing HT siblings of each logical CPU */
+ cpumask_t sibling_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 40d1c96..b5e7cd2 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -31,15 +31,9 @@ static inline bool cpu_has_ht_siblings(void)
return has_siblings;
}

-DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU(int, cpu_number);

-static inline struct cpumask *cpu_sibling_mask(int cpu)
-{
- return per_cpu(cpu_sibling_map, cpu);
-}
-
static inline struct cpumask *cpu_core_mask(int cpu)
{
return per_cpu(cpu_core_map, cpu);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index b9676ae..5297acbf 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -161,7 +161,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
-#define topology_thread_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
+#define topology_thread_cpumask(cpu) (&cpu_data(cpu).sibling_map)

/* indicates that pointers to the topology cpumask_t maps are valid */
#define arch_provides_topology_pointers yes
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 5ddd6ef..7787d33 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -739,11 +739,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
}
} else if ((c->x86 == 0x15) && ((index == 1) || (index == 2))) {
ret = 1;
- for_each_cpu(i, cpu_sibling_mask(cpu)) {
+ for_each_cpu(i, &c->sibling_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_sibling_mask(cpu)) {
+ for_each_cpu(sibling, &c->sibling_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3210646..7e73ea7 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,10 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* representing HT siblings of each logical CPU */
-DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
-EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
-
/* representing HT and core siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);
@@ -328,8 +324,8 @@ void __cpuinit smp_store_cpu_info(int id)

static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
{
- cpumask_set_cpu(cpu1, cpu_sibling_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).sibling_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).sibling_map);
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
@@ -359,13 +355,13 @@ void __cpuinit set_cpu_sibling_map(int cpu)
}
}
} else {
- cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
+ cpumask_set_cpu(cpu, &c->sibling_map);
}

cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
- cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
+ cpumask_copy(cpu_core_mask(cpu), &c->sibling_map);
c->booted_cores = 1;
return;
}
@@ -383,12 +379,12 @@ void __cpuinit set_cpu_sibling_map(int cpu)
/*
* Does this new cpu bringup a new core?
*/
- if (cpumask_weight(cpu_sibling_mask(cpu)) == 1) {
+ if (cpumask_weight(&c->sibling_map) == 1) {
/*
* for each core in package, increment
* the booted_cores for this new cpu
*/
- if (cpumask_first(cpu_sibling_mask(i)) == i)
+ if (cpumask_first(&o->sibling_map) == i)
c->booted_cores++;
/*
* increment the core count for all
@@ -908,7 +904,7 @@ static __init void disable_smp(void)
physid_set_mask_of_physid(boot_cpu_physical_apicid, &phys_cpu_present_map);
else
physid_set_mask_of_physid(0, &phys_cpu_present_map);
- cpumask_set_cpu(0, cpu_sibling_mask(0));
+ cpumask_set_cpu(0, &cpu_data(0).sibling_map);
cpumask_set_cpu(0, cpu_core_mask(0));
}

@@ -1046,7 +1042,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)

current_thread_info()->cpu = 0; /* needed? */
for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);
@@ -1241,13 +1236,13 @@ static void remove_siblinginfo(int cpu)
/*/
* last thread sibling in this cpu core going down
*/
- if (cpumask_weight(cpu_sibling_mask(cpu)) == 1)
+ if (cpumask_weight(&c->sibling_map) == 1)
cpu_data(sibling).booted_cores--;
}

- for_each_cpu(sibling, cpu_sibling_mask(cpu))
- cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
- cpumask_clear(cpu_sibling_mask(cpu));
+ for_each_cpu(sibling, &c->sibling_map)
+ cpumask_clear_cpu(cpu, &c->sibling_map);
+ cpumask_clear(&c->sibling_map);
cpumask_clear(cpu_core_mask(cpu));
c->phys_proc_id = 0;
c->cpu_core_id = 0;
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 98ab130..ae3503e 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -370,11 +370,8 @@ static struct p4_event_binding p4_events[NUM_EVENTS] = {
or "odd" part of all the divided resources. */
static unsigned int get_stagger(void)
{
-#ifdef CONFIG_SMP
int cpu = smp_processor_id();
- return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
-#endif
- return 0;
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
}


diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index b9f7a86..00f32c0 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -223,7 +223,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
cpu_data(0).x86_max_cores = 1;

for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);
diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c
index 6be3e07..a14b9b0 100644
--- a/drivers/cpufreq/p4-clockmod.c
+++ b/drivers/cpufreq/p4-clockmod.c
@@ -203,9 +203,7 @@ static int cpufreq_p4_cpu_init(struct cpufreq_policy *policy)
int cpuid = 0;
unsigned int i;

-#ifdef CONFIG_SMP
- cpumask_copy(policy->cpus, cpu_sibling_mask(policy->cpu));
-#endif
+ cpumask_copy(policy->cpus, &c->sibling_map);

/* Errata workaround */
cpuid = (c->x86 << 8) | (c->x86_model << 4) | c->x86_mask;
diff --git a/drivers/cpufreq/speedstep-ich.c b/drivers/cpufreq/speedstep-ich.c
index a748ce7..630926a 100644
--- a/drivers/cpufreq/speedstep-ich.c
+++ b/drivers/cpufreq/speedstep-ich.c
@@ -326,14 +326,14 @@ static void get_freqs_on_cpu(void *_get_freqs)

static int speedstep_cpu_init(struct cpufreq_policy *policy)
{
+ struct cpuinfo_x86 *c = &cpu_data(policy->cpu);
int result;
unsigned int policy_cpu, speed;
struct get_freqs gf;

/* only run on CPU to be set, or on its sibling */
-#ifdef CONFIG_SMP
- cpumask_copy(policy->cpus, cpu_sibling_mask(policy->cpu));
-#endif
+ cpumask_copy(policy->cpus, c->sibling_map);
+
policy_cpu = cpumask_any_and(policy->cpus, cpu_online_mask);

/* detect low and high frequency and transition latency */
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index a6c6ec3..fdf1590 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -61,11 +61,7 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
#define TO_CORE_ID(cpu) cpu_data(cpu).cpu_core_id
#define TO_ATTR_NO(cpu) (TO_CORE_ID(cpu) + BASE_SYSFS_ATTR_NO)

-#ifdef CONFIG_SMP
-#define for_each_sibling(i, cpu) for_each_cpu(i, cpu_sibling_mask(cpu))
-#else
-#define for_each_sibling(i, cpu) for (i = 0; false; )
-#endif
+#define for_each_sibling(i, cpu) for_each_cpu(i, &cpu_data(cpu).sibling_map)

/*
* Per-Core Temperature Data
--
1.7.9

2012-02-22 23:33:05

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v3 4/5] x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.
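
For illustration, a minimal sketch of a read-side user after the conversion;
cpu_data() and the new core_map field come from the hunks below, while
package_sibling_count() is a made-up example helper, not kernel API:

#include <linux/cpumask.h>
#include <asm/processor.h>

/* threads in the same physical package, as filled in by set_cpu_sibling_map() */
static unsigned int package_sibling_count(int cpu)
{
	/* previously: cpumask_weight(cpu_core_mask(cpu)) */
	return cpumask_weight(&cpu_data(cpu).core_map);
}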

Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 5 +++++
arch/x86/include/asm/smp.h | 6 ------
arch/x86/include/asm/topology.h | 4 ++--
arch/x86/kernel/cpu/proc.c | 3 +--
arch/x86/kernel/smpboot.c | 35 ++++++++++++++---------------------
arch/x86/xen/smp.c | 4 ----
drivers/cpufreq/acpi-cpufreq.c | 2 +-
drivers/cpufreq/powernow-k8.c | 13 +++----------
8 files changed, 26 insertions(+), 46 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a3fce4e..35ab05b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -115,6 +115,11 @@ struct cpuinfo_x86 {
u16 llc_id;
/* representing HT siblings of each logical CPU */
cpumask_t sibling_map;
+ /*
+ * representing all execution threads on a logical CPU, i.e. per
+ * physical socket
+ */
+ cpumask_t core_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index b5e7cd2..75aea4d 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -31,14 +31,8 @@ static inline bool cpu_has_ht_siblings(void)
return has_siblings;
}

-DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU(int, cpu_number);

-static inline struct cpumask *cpu_core_mask(int cpu)
-{
- return per_cpu(cpu_core_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 5297acbf..58438a1b 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -160,7 +160,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#ifdef ENABLE_TOPO_DEFINES
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
-#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
+#define topology_core_cpumask(cpu) (&cpu_data(cpu).core_map)
#define topology_thread_cpumask(cpu) (&cpu_data(cpu).sibling_map)

/* indicates that pointers to the topology cpumask_t maps are valid */
@@ -176,7 +176,7 @@ void x86_pci_root_bus_resources(int bus, struct list_head *resources);

#ifdef CONFIG_SMP
#define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
- (cpumask_weight(cpu_core_mask(0)) != nr_cpu_ids))
+ (cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
#define smt_capable() (smp_num_siblings > 1)
#endif

diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 8022c66..e6e07c2 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -13,8 +13,7 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
#ifdef CONFIG_SMP
if (c->x86_max_cores * smp_num_siblings > 1) {
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
- seq_printf(m, "siblings\t: %d\n",
- cpumask_weight(cpu_core_mask(cpu)));
+ seq_printf(m, "siblings\t: %d\n", cpumask_weight(&c->core_map));
seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
seq_printf(m, "apicid\t\t: %d\n", c->apicid);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 7e73ea7..3a4908d 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,10 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* representing HT and core siblings of each logical CPU */
-DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
-EXPORT_PER_CPU_SYMBOL(cpu_core_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -326,8 +322,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
{
cpumask_set_cpu(cpu1, &cpu_data(cpu2).sibling_map);
cpumask_set_cpu(cpu2, &cpu_data(cpu1).sibling_map);
- cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).core_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).core_map);
cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
cpumask_set_cpu(cpu2, &cpu_data(cpu1).llc_shared_map);
}
@@ -361,7 +357,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
- cpumask_copy(cpu_core_mask(cpu), &c->sibling_map);
+ cpumask_copy(&c->core_map, &c->sibling_map);
c->booted_cores = 1;
return;
}
@@ -374,8 +370,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, &o->llc_shared_map);
}
if (c->phys_proc_id == o->phys_proc_id) {
- cpumask_set_cpu(i, cpu_core_mask(cpu));
- cpumask_set_cpu(cpu, cpu_core_mask(i));
+ cpumask_set_cpu(i, &c->core_map);
+ cpumask_set_cpu(cpu, &o->core_map);
/*
* Does this new cpu bringup a new core?
*/
@@ -404,11 +400,11 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
struct cpuinfo_x86 *c = &cpu_data(cpu);
/*
* For perf, we return last level cache shared map.
- * And for power savings, we return cpu_core_map
+ * And for power savings, we return core map.
*/
if ((sched_mc_power_savings || sched_smt_power_savings) &&
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
- return cpu_core_mask(cpu);
+ return &c->core_map;
else
return &c->llc_shared_map;
}
@@ -905,7 +901,7 @@ static __init void disable_smp(void)
else
physid_set_mask_of_physid(0, &phys_cpu_present_map);
cpumask_set_cpu(0, &cpu_data(0).sibling_map);
- cpumask_set_cpu(0, cpu_core_mask(0));
+ cpumask_set_cpu(0, &cpu_data(0).core_map);
}

/*
@@ -1028,8 +1024,6 @@ static void __init smp_cpu_index_default(void)
*/
void __init native_smp_prepare_cpus(unsigned int max_cpus)
{
- unsigned int i;
-
preempt_disable();
smp_cpu_index_default();

@@ -1041,9 +1035,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
mb();

current_thread_info()->cpu = 0; /* needed? */
- for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- }
set_cpu_sibling_map(0);


@@ -1231,19 +1222,21 @@ static void remove_siblinginfo(int cpu)
int sibling;
struct cpuinfo_x86 *c = &cpu_data(cpu);

- for_each_cpu(sibling, cpu_core_mask(cpu)) {
- cpumask_clear_cpu(cpu, cpu_core_mask(sibling));
+ for_each_cpu(sibling, &c->core_map) {
+ struct cpuinfo_x86 *o = &cpu_data(sibling);
+
+ cpumask_clear_cpu(cpu, &o->core_map);
/*/
* last thread sibling in this cpu core going down
*/
if (cpumask_weight(&c->sibling_map) == 1)
- cpu_data(sibling).booted_cores--;
+ o->booted_cores--;
}

for_each_cpu(sibling, &c->sibling_map)
cpumask_clear_cpu(cpu, &c->sibling_map);
cpumask_clear(&c->sibling_map);
- cpumask_clear(cpu_core_mask(cpu));
+ cpumask_clear(&c->core_map);
c->phys_proc_id = 0;
c->cpu_core_id = 0;
cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 00f32c0..d1792ec 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -206,7 +206,6 @@ static void __init xen_smp_prepare_boot_cpu(void)
static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
{
unsigned cpu;
- unsigned int i;

if (skip_ioapic_setup) {
char *m = (max_cpus == 0) ?
@@ -222,9 +221,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
smp_store_cpu_info(0);
cpu_data(0).x86_max_cores = 1;

- for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- }
set_cpu_sibling_map(0);

if (xen_smp_intr_init(0))
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 56c6c6b..152af7f 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -557,7 +557,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
dmi_check_system(sw_any_bug_dmi_table);
if (bios_with_sw_any_bug && cpumask_weight(policy->cpus) == 1) {
policy->shared_type = CPUFREQ_SHARED_TYPE_ALL;
- cpumask_copy(policy->cpus, cpu_core_mask(cpu));
+ cpumask_copy(policy->cpus, &c->core_map);
}
#endif

diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
index 8f9b2ce..da0767c 100644
--- a/drivers/cpufreq/powernow-k8.c
+++ b/drivers/cpufreq/powernow-k8.c
@@ -66,13 +66,6 @@ static struct msr __percpu *msrs;

static struct cpufreq_driver cpufreq_amd64_driver;

-#ifndef CONFIG_SMP
-static inline const struct cpumask *cpu_core_mask(int cpu)
-{
- return cpumask_of(0);
-}
-#endif
-
/* Return a frequency in MHz, given an input fid */
static u32 find_freq_from_fid(u32 fid)
{
@@ -715,7 +708,7 @@ static int fill_powernow_table(struct powernow_k8_data *data,

pr_debug("cfid 0x%x, cvid 0x%x\n", data->currfid, data->currvid);
data->powernow_table = powernow_table;
- if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu)
+ if (cpumask_first(&cpu_data(data->cpu).core_map) == data->cpu)
print_basics(data);

for (j = 0; j < data->numps; j++)
@@ -884,7 +877,7 @@ static int powernow_k8_cpu_init_acpi(struct powernow_k8_data *data)
powernow_table[data->acpi_data.state_count].index = 0;
data->powernow_table = powernow_table;

- if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu)
+ if (cpumask_first(&cpu_data(data->cpu).core_map) == data->cpu)
print_basics(data);

/* notify BIOS that we exist */
@@ -1326,7 +1319,7 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
if (cpu_family == CPU_HW_PSTATE)
cpumask_copy(pol->cpus, cpumask_of(pol->cpu));
else
- cpumask_copy(pol->cpus, cpu_core_mask(pol->cpu));
+ cpumask_copy(pol->cpus, &c->core_map);
data->available_cores = pol->cpus;

if (cpu_family == CPU_HW_PSTATE)
--
1.7.9

2012-02-22 23:33:22

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v3 5/5] x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings into common.c

smp_num_siblings was defined in arch/x86/kernel/smpboot.c, which is only
built when CONFIG_SMP is set, so any code referencing it that is also
compiled on UP had to be wrapped in #ifdef CONFIG_SMP.

Instead, move the definition to arch/x86/kernel/cpu/common.c, which is
always built, making the variable available on UP and SMP kernels alike.
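
As a minimal sketch of the effect, mirroring the op_model_p4.c addr_increment()
hunk below: with the definition in common.c the variable can be read
unconditionally, since a UP kernel simply keeps the default value of 1:

#include <asm/smp.h>

static inline int addr_increment(void)
{
	/* no #ifdef CONFIG_SMP needed; a UP build sees the default of 1 */
	return smp_num_siblings == 2 ? 2 : 1;
}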

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/perf_event_p4.h | 14 +++-----------
arch/x86/include/asm/smp.h | 6 +-----
arch/x86/include/asm/topology.h | 4 +---
arch/x86/kernel/cpu/amd.c | 4 ----
arch/x86/kernel/cpu/common.c | 6 ++++--
arch/x86/kernel/cpu/perf_event_p4.c | 4 ++--
arch/x86/kernel/cpu/proc.c | 5 ++---
arch/x86/kernel/cpu/topology.c | 2 --
arch/x86/kernel/process.c | 3 +--
arch/x86/kernel/smpboot.c | 4 ----
arch/x86/oprofile/nmi_int.c | 6 ------
arch/x86/oprofile/op_model_p4.c | 6 ------
12 files changed, 14 insertions(+), 50 deletions(-)

diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 29a65c2..cfe41dc 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -8,6 +8,8 @@
#include <linux/cpu.h>
#include <linux/bitops.h>

+#include <asm/smp.h>
+
/*
* NetBurst has performance MSRs shared between
* threads if HT is turned on, ie for both logical
@@ -177,20 +179,10 @@ static inline u64 p4_clear_ht_bit(u64 config)
return config & ~P4_CONFIG_HT;
}

-static inline int p4_ht_active(void)
-{
-#ifdef CONFIG_SMP
- return smp_num_siblings > 1;
-#endif
- return 0;
-}
-
static inline int p4_ht_thread(int cpu)
{
-#ifdef CONFIG_SMP
if (smp_num_siblings == 2)
- return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
-#endif
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
return 0;
}

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 75aea4d..787127e 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -24,11 +24,7 @@ extern unsigned int num_processors;

static inline bool cpu_has_ht_siblings(void)
{
- bool has_siblings = false;
-#ifdef CONFIG_SMP
- has_siblings = cpu_has_ht && smp_num_siblings > 1;
-#endif
- return has_siblings;
+ return cpu_has_ht && smp_num_siblings > 1;
}

DECLARE_PER_CPU(int, cpu_number);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 58438a1b..7250ad1 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -174,11 +174,9 @@ static inline void arch_fix_phys_package_id(int num, u32 slot)
struct pci_bus;
void x86_pci_root_bus_resources(int bus, struct list_head *resources);

-#ifdef CONFIG_SMP
#define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
(cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
-#define smt_capable() (smp_num_siblings > 1)
-#endif
+#define smt_capable() (smp_num_siblings > 1)

#ifdef CONFIG_NUMA
extern int get_mp_bus_to_node(int busnum);
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 1cd9d51..a8b46df 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -263,7 +263,6 @@ static int __cpuinit nearby_node(int apicid)
* Assumption: Number of cores in each internal node is the same.
* (2) AMD processors supporting compute units
*/
-#ifdef CONFIG_X86_HT
static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
{
u32 nodes, cores_per_cu = 1;
@@ -307,7 +306,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
c->compute_unit_id %= cus_per_node;
}
}
-#endif

/*
* On a AMD dual core setup the lower bits of the APIC id distingush the cores.
@@ -315,7 +313,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
*/
static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_X86_HT
unsigned bits;

bits = c->x86_coreid_bits;
@@ -326,7 +323,6 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
/* use socket ID also for last level cache */
c->llc_id = c->phys_proc_id;
amd_get_topology(c);
-#endif
}

int amd_get_nb_id(int cpu)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ad2a148..8343f54 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -48,6 +48,10 @@ cpumask_var_t cpu_initialized_mask;
cpumask_var_t cpu_callout_mask;
cpumask_var_t cpu_callin_mask;

+/* Number of siblings per CPU package */
+int smp_num_siblings = 1;
+EXPORT_SYMBOL(smp_num_siblings);
+
/* representing cpus for which sibling maps can be computed */
cpumask_var_t cpu_sibling_setup_mask;

@@ -453,7 +457,6 @@ void __cpuinit cpu_detect_cache_sizes(struct cpuinfo_x86 *c)

void __cpuinit detect_ht(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_X86_HT
u32 eax, ebx, ecx, edx;
int index_msb, core_bits;
static bool printed;
@@ -499,7 +502,6 @@ out:
c->cpu_core_id);
printed = 1;
}
-#endif
}

static void __cpuinit get_cpu_vendor(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/perf_event_p4.c b/arch/x86/kernel/cpu/perf_event_p4.c
index ef484d9..9d1413d 100644
--- a/arch/x86/kernel/cpu/perf_event_p4.c
+++ b/arch/x86/kernel/cpu/perf_event_p4.c
@@ -775,7 +775,7 @@ static int p4_validate_raw_event(struct perf_event *event)
* if an event is shared across the logical threads
* the user needs special permissions to be able to use it
*/
- if (p4_ht_active() && p4_event_bind_map[v].shared) {
+ if (smt_capable() && p4_event_bind_map[v].shared) {
if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
return -EACCES;
}
@@ -816,7 +816,7 @@ static int p4_hw_config(struct perf_event *event)
event->hw.config = p4_config_pack_escr(escr) |
p4_config_pack_cccr(cccr);

- if (p4_ht_active() && p4_ht_thread(cpu))
+ if (smt_capable() && p4_ht_thread(cpu))
event->hw.config = p4_set_ht_bit(event->hw.config);

if (event->attr.type == PERF_TYPE_RAW) {
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index e6e07c2..aef8b27 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -1,16 +1,16 @@
-#include <linux/smp.h>
#include <linux/timex.h>
#include <linux/string.h>
#include <linux/seq_file.h>
#include <linux/cpufreq.h>

+#include <asm/smp.h>
+
/*
* Get CPU information for use by the procfs.
*/
static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
unsigned int cpu)
{
-#ifdef CONFIG_SMP
if (c->x86_max_cores * smp_num_siblings > 1) {
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
seq_printf(m, "siblings\t: %d\n", cpumask_weight(&c->core_map));
@@ -19,7 +19,6 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
seq_printf(m, "apicid\t\t: %d\n", c->apicid);
seq_printf(m, "initial apicid\t: %d\n", c->initial_apicid);
}
-#endif
}

#ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 4397e98..d4ee471 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -28,7 +28,6 @@
*/
void __cpuinit detect_extended_topology(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_SMP
unsigned int eax, ebx, ecx, edx, sub_index;
unsigned int ht_mask_width, core_plus_mask_width;
unsigned int core_select_mask, core_level_siblings;
@@ -95,5 +94,4 @@ void __cpuinit detect_extended_topology(struct cpuinfo_x86 *c)
printed = 1;
}
return;
-#endif
}
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 14baf78..c992254 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -587,12 +587,11 @@ static void amd_e400_idle(void)

void __cpuinit select_idle_routine(const struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_SMP
if (pm_idle == poll_idle && smp_num_siblings > 1) {
printk_once(KERN_WARNING "WARNING: polling idle and HT enabled,"
" performance may degrade.\n");
}
-#endif
+
if (pm_idle)
return;

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3a4908d..4c5a5e5 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -112,10 +112,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
#define set_idle_for_cpu(x, p) (idle_thread_array[(x)] = (p))
#endif

-/* Number of siblings per CPU package */
-int smp_num_siblings = 1;
-EXPORT_SYMBOL(smp_num_siblings);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 26b8a85..346e7ac 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -572,11 +572,6 @@ static int __init p4_init(char **cpu_type)
if (cpu_model > 6 || cpu_model == 5)
return 0;

-#ifndef CONFIG_SMP
- *cpu_type = "i386/p4";
- model = &op_p4_spec;
- return 1;
-#else
switch (smp_num_siblings) {
case 1:
*cpu_type = "i386/p4";
@@ -588,7 +583,6 @@ static int __init p4_init(char **cpu_type)
model = &op_p4_ht2_spec;
return 1;
}
-#endif

printk(KERN_INFO "oprofile: P4 HyperThreading detected with > 2 threads\n");
printk(KERN_INFO "oprofile: Reverting to timer mode.\n");
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index ae3503e..c6bcb22 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -42,21 +42,15 @@ static unsigned int num_controls = NUM_CONTROLS_NON_HT;
kernel boot-time. */
static inline void setup_num_counters(void)
{
-#ifdef CONFIG_SMP
if (smp_num_siblings == 2) {
num_counters = NUM_COUNTERS_HT2;
num_controls = NUM_CONTROLS_HT2;
}
-#endif
}

static inline int addr_increment(void)
{
-#ifdef CONFIG_SMP
return smp_num_siblings == 2 ? 2 : 1;
-#else
- return 1;
-#endif
}


--
1.7.9

2012-02-22 23:33:54

by Kevin Winchester

[permalink] [raw]
Subject: [PATCH v3 2/5] x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.
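
For illustration, a minimal sketch of a read-side user after the conversion;
cpu_data() and the new llc_id field come from the hunks below, while
llc_id_of() is a made-up example helper, not kernel API:

#include <linux/types.h>
#include <asm/processor.h>

static u16 llc_id_of(int cpu)
{
	/* previously: per_cpu(cpu_llc_id, cpu); BAD_APICID means "not set" */
	return cpu_data(cpu).llc_id;
}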

Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/smp.h | 1 -
arch/x86/kernel/apic/apic_numachip.c | 2 +-
arch/x86/kernel/cpu/amd.c | 14 ++++----------
arch/x86/kernel/cpu/common.c | 1 +
arch/x86/kernel/cpu/intel_cacheinfo.c | 11 ++---------
arch/x86/kernel/smpboot.c | 18 ++++++++----------
7 files changed, 17 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 9fe3c5e..2d304f9 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -112,6 +112,7 @@ struct cpuinfo_x86 {
u32 microcode;
/* CPUs sharing the last level cache: */
cpumask_t llc_shared_map;
+ u16 llc_id;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 61ebe324..40d1c96 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -33,7 +33,6 @@ static inline bool cpu_has_ht_siblings(void)

DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
-DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

static inline struct cpumask *cpu_sibling_mask(int cpu)
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index 09d3d8c..73c46cf 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -202,7 +202,7 @@ static void __init map_csrs(void)
static void fixup_cpu_id(struct cpuinfo_x86 *c, int node)
{
c->phys_proc_id = node;
- per_cpu(cpu_llc_id, smp_processor_id()) = node;
+ c->llc_id = node;
}

static int __init numachip_system_init(void)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 0a44b90..1cd9d51 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -268,7 +268,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
{
u32 nodes, cores_per_cu = 1;
u8 node_id;
- int cpu = smp_processor_id();

/* get information required for multi-node processors */
if (cpu_has(c, X86_FEATURE_TOPOEXT)) {
@@ -301,7 +300,7 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
cus_per_node = cores_per_node / cores_per_cu;

/* store NodeID, use llc_shared_map to store sibling info */
- per_cpu(cpu_llc_id, cpu) = node_id;
+ c->llc_id = node_id;

/* core id has to be in the [0 .. cores_per_node - 1] range */
c->cpu_core_id %= cores_per_node;
@@ -318,7 +317,6 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_X86_HT
unsigned bits;
- int cpu = smp_processor_id();

bits = c->x86_coreid_bits;
/* Low order bits define the core id (index of core in socket) */
@@ -326,18 +324,14 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
/* Convert the initial APIC ID into the socket ID */
c->phys_proc_id = c->initial_apicid >> bits;
/* use socket ID also for last level cache */
- per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
+ c->llc_id = c->phys_proc_id;
amd_get_topology(c);
#endif
}

int amd_get_nb_id(int cpu)
{
- int id = 0;
-#ifdef CONFIG_SMP
- id = per_cpu(cpu_llc_id, cpu);
-#endif
- return id;
+ return cpu_data(cpu).llc_id;
}
EXPORT_SYMBOL_GPL(amd_get_nb_id);

@@ -350,7 +344,7 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)

node = numa_cpu_node(cpu);
if (node == NUMA_NO_NODE)
- node = per_cpu(cpu_llc_id, cpu);
+ node = c->llc_id;

/*
* If core numbers are inconsistent, it's likely a multi-fabric platform,
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ade9c79..ad2a148 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -788,6 +788,7 @@ static void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
c->x86_model_id[0] = '\0'; /* Unset */
c->x86_max_cores = 1;
c->x86_coreid_bits = 0;
+ c->llc_id = BAD_APICID;
#ifdef CONFIG_X86_64
c->x86_clflush_size = 64;
c->x86_phys_bits = 36;
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index a9cd551..5ddd6ef 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -579,9 +579,6 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
unsigned int new_l1d = 0, new_l1i = 0; /* Cache sizes from cpuid(4) */
unsigned int new_l2 = 0, new_l3 = 0, i; /* Cache sizes from cpuid(4) */
unsigned int l2_id = 0, l3_id = 0, num_threads_sharing, index_msb;
-#ifdef CONFIG_X86_HT
- unsigned int cpu = c->cpu_index;
-#endif

if (c->cpuid_level > 3) {
static int is_initialized;
@@ -700,16 +697,12 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)

if (new_l2) {
l2 = new_l2;
-#ifdef CONFIG_X86_HT
- per_cpu(cpu_llc_id, cpu) = l2_id;
-#endif
+ c->llc_id = l2_id;
}

if (new_l3) {
l3 = new_l3;
-#ifdef CONFIG_X86_HT
- per_cpu(cpu_llc_id, cpu) = l3_id;
-#endif
+ c->llc_id = l3_id;
}

c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index b988c13..3210646 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,9 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* Last level cache ID of each logical CPU */
-DEFINE_PER_CPU(u16, cpu_llc_id) = BAD_APICID;
-
/* representing HT siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
@@ -353,7 +350,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)

if (cpu_has(c, X86_FEATURE_TOPOEXT)) {
if (c->phys_proc_id == o->phys_proc_id &&
- per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i) &&
+ c->llc_id == o->llc_id &&
c->compute_unit_id == o->compute_unit_id)
link_thread_siblings(cpu, i);
} else if (c->phys_proc_id == o->phys_proc_id &&
@@ -374,12 +371,13 @@ void __cpuinit set_cpu_sibling_map(int cpu)
}

for_each_cpu(i, cpu_sibling_setup_mask) {
- if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
- per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
+ struct cpuinfo_x86 *o = &cpu_data(i);
+
+ if (c->llc_id != BAD_APICID && c->llc_id == o->llc_id) {
cpumask_set_cpu(i, &c->llc_shared_map);
- cpumask_set_cpu(cpu, &cpu_data(i).llc_shared_map);
+ cpumask_set_cpu(cpu, &o->llc_shared_map);
}
- if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
+ if (c->phys_proc_id == o->phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
cpumask_set_cpu(cpu, cpu_core_mask(i));
/*
@@ -397,9 +395,9 @@ void __cpuinit set_cpu_sibling_map(int cpu)
* the other cpus in this package
*/
if (i != cpu)
- cpu_data(i).booted_cores++;
+ o->booted_cores++;
} else if (i != cpu && !c->booted_cores)
- c->booted_cores = cpu_data(i).booted_cores;
+ c->booted_cores = o->booted_cores;
}
}
}
--
1.7.9

2012-02-22 23:44:01

by Kevin Winchester

[permalink] [raw]
Subject: Re: [PATCH v3 0/5] x86: Cleanup and simplify cpu-specific data

On 22 February 2012 19:32, Kevin Winchester <[email protected]> wrote:
> Kevin Winchester (5):
>  x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86
>  x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86
>  x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86
>  x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86
>  x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings
>    into common.c
>
>  arch/x86/include/asm/perf_event_p4.h  |   14 +----
>  arch/x86/include/asm/processor.h      |   10 ++++
>  arch/x86/include/asm/smp.h            |   26 +---------
>  arch/x86/include/asm/topology.h       |   10 ++--
>  arch/x86/kernel/apic/apic_numachip.c  |    2 +-
>  arch/x86/kernel/cpu/amd.c             |   18 ++-----
>  arch/x86/kernel/cpu/common.c          |    7 ++-
>  arch/x86/kernel/cpu/intel_cacheinfo.c |   19 ++-----
>  arch/x86/kernel/cpu/mcheck/mce_amd.c  |    9 ++--
>  arch/x86/kernel/cpu/perf_event_p4.c   |    4 +-
>  arch/x86/kernel/cpu/proc.c            |    8 +--
>  arch/x86/kernel/cpu/topology.c        |    2 -
>  arch/x86/kernel/process.c             |    3 +-
>  arch/x86/kernel/smpboot.c             |   95 +++++++++++++--------------------
>  arch/x86/oprofile/nmi_int.c           |    6 --
>  arch/x86/oprofile/op_model_p4.c       |   11 +----
>  arch/x86/xen/smp.c                    |    6 --
>  drivers/cpufreq/acpi-cpufreq.c        |    2 +-
>  drivers/cpufreq/p4-clockmod.c         |    4 +-
>  drivers/cpufreq/powernow-k8.c         |   13 +----
>  drivers/cpufreq/speedstep-ich.c       |    6 +-
>  drivers/hwmon/coretemp.c              |    6 +--
>  22 files changed, 91 insertions(+), 190 deletions(-)
>

Please ignore this posting, I rushed it and did not build all
combinations. It looks like a new user of cpu_core_mask appeared in
tip since my last rebase.

I'll send the series out again once I have confirmed that everything is correct.

--
Kevin Winchester

2012-02-23 07:33:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v3 0/5] x86: Cleanup and simplify cpu-specific data


* Kevin Winchester <[email protected]> wrote:

> 22 files changed, 91 insertions(+), 190 deletions(-)

Btw., the diffstat is very nice - the individual patches only
hinted at the simplification effect - the combined diffstat
shows that we saved a hundred lines of often rather complex code
and made this code quite a bit more hackable.

Thanks,

Ingo