LinuxLists.cc - [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

2012-02-23 23:58:12

Subject: [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

Various per-cpu fields are defined in arch/x86/kernel/smpboot.c that are
basically equivalent to the cpu-specific data in struct cpuinfo_x86.
By moving these fields into the structure, a number of codepaths can be
simplified since they no longer need to care about those fields not
existing on !SMP builds.

The size effects on allno (UP) and allyes (MAX_SMP) kernels are as
follows:

     text     data      bss       dec     hex filename
  1586721   304864   506208   2397793  249661 vmlinux.allno
  1588517   304928   505920   2399365  249c85 vmlinux.allno.after
 84706053 13212311 42434560 140352924 85d9d9c vmlinux.allyes
 84705333 13213799 42434560 140353692 85da09c vmlinux.allyes.afte

As can be seen, the kernels get slightly larger, but the code reduction/
simplification should be enough to compensate for it.

Kevin Winchester (5):
x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86
x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86
x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86
x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86
x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings
into common.c

arch/x86/include/asm/perf_event_p4.h | 14 +----
arch/x86/include/asm/processor.h | 10 ++++
arch/x86/include/asm/smp.h | 26 +---------
arch/x86/include/asm/topology.h | 10 ++--
arch/x86/kernel/apic/apic_numachip.c | 2 +-
arch/x86/kernel/cpu/amd.c | 18 ++-----
arch/x86/kernel/cpu/common.c | 7 ++-
arch/x86/kernel/cpu/intel_cacheinfo.c | 19 ++-----
arch/x86/kernel/cpu/mcheck/mce_amd.c | 9 ++--
arch/x86/kernel/cpu/perf_event_p4.c | 4 +-
arch/x86/kernel/cpu/proc.c | 8 +--
arch/x86/kernel/cpu/topology.c | 2 -
arch/x86/kernel/process.c | 3 +-
arch/x86/kernel/smpboot.c | 95 +++++++++++++--------------------
arch/x86/kernel/tsc_sync.c | 2 +-
arch/x86/oprofile/nmi_int.c | 6 --
arch/x86/oprofile/op_model_p4.c | 11 +----
arch/x86/xen/smp.c | 6 --
drivers/cpufreq/acpi-cpufreq.c | 2 +-
drivers/cpufreq/p4-clockmod.c | 4 +-
drivers/cpufreq/powernow-k8.c | 13 +----
drivers/cpufreq/speedstep-ich.c | 6 +-
drivers/hwmon/coretemp.c | 6 +--
23 files changed, 92 insertions(+), 191 deletions(-)

--
1.7.9.1

2012-02-23 23:58:14

by Kevin Winchester

Subject: [PATCH v4 1/5] x86: Move per cpu cpu_llc_shared_map to a field in struct cpuinfo_x86

Commit 141168c36cde ("x86: Simplify code by removing a !SMP #ifdefs from
'struct cpuinfo_x86'") caused the compilation error:

mce_amd.c:(.cpuinit.text+0x4723): undefined reference to 'cpu_llc_shared_map'

by removing an #ifdef CONFIG_SMP around a block containing a reference
to cpu_llc_shared_map. Rather than replace the #ifdef, move
cpu_llc_shared_map to be a new cpumask_t field llc_shared_map in
struct cpuinfo_x86 and adjust all references to cpu_llc_shared_map.

The size effects on various kernels are as follows:

   text   data     bss     dec    hex filename
5281572 513296 1044480 6839348 685c34 vmlinux.up
5281572 513296 1044480 6839348 685c34 vmlinux.up.patched
5548860 516792 1110016 7175668 6d7df4 vmlinux.smp.2
5548837 516792 1110016 7175645 6d7ddd vmlinux.smp.2.patched
5595965 706840 1310720 7613525 742c55 vmlinux.smp.max
5595876 707880 1310720 7614476 74300c vmlinux.smp.max.patched

It can be seen that this change has no effect on UP, a minor effect for
SMP with Max 2 CPUs, and a more substantial but still not overly large
effect for MAXSMP.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 7 -------
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/cpu/mcheck/mce_amd.c | 9 ++++-----
arch/x86/kernel/smpboot.c | 15 ++++++---------
arch/x86/xen/smp.c | 1 -
6 files changed, 14 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c59ff02..9fe3c5e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -110,6 +110,8 @@ struct cpuinfo_x86 {
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
+ /* CPUs sharing the last level cache: */
+ cpumask_t llc_shared_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 0434c40..61ebe324 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -33,8 +33,6 @@ static inline bool cpu_has_ht_siblings(void)

DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
-/* cpus sharing the last level cache: */
-DECLARE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
DECLARE_PER_CPU(u16, cpu_llc_id);
DECLARE_PER_CPU(int, cpu_number);

@@ -48,11 +46,6 @@ static inline struct cpumask *cpu_core_mask(int cpu)
return per_cpu(cpu_core_map, cpu);
}

-static inline struct cpumask *cpu_llc_shared_mask(int cpu)
-{
- return per_cpu(cpu_llc_shared_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 73d08ed..a9cd551 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -734,11 +734,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
ret = 0;
if (index == 3) {
ret = 1;
- for_each_cpu(i, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(i, &c->llc_shared_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) {
+ for_each_cpu(sibling, &c->llc_shared_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index e4eeaaf..5e0ec2c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -525,12 +525,12 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
struct threshold_bank *b = NULL;
struct device *dev = mce_device[cpu];
char name[32];
+ struct cpuinfo_x86 *c = &cpu_data(cpu);

sprintf(name, "threshold_bank%i", bank);

-#ifdef CONFIG_SMP
- if (cpu_data(cpu).cpu_core_id && shared_bank[bank]) { /* symlink */
- i = cpumask_first(cpu_llc_shared_mask(cpu));
+ if (c->cpu_core_id && shared_bank[bank]) { /* symlink */
+ i = cpumask_first(&c->llc_shared_map);

/* first core not up yet */
if (cpu_data(i).cpu_core_id)
@@ -549,12 +549,11 @@ static __cpuinit int threshold_create_bank(unsigned int cpu, unsigned int bank)
if (err)
goto out;

- cpumask_copy(b->cpus, cpu_llc_shared_mask(cpu));
+ cpumask_copy(b->cpus, &c->llc_shared_map);
per_cpu(threshold_banks, cpu)[bank] = b;

goto out;
}
-#endif

b = kzalloc(sizeof(struct threshold_bank), GFP_KERNEL);
if (!b) {
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f34f8b2..b988c13 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -127,8 +127,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);

-DEFINE_PER_CPU(cpumask_var_t, cpu_llc_shared_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -337,8 +335,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
- cpumask_set_cpu(cpu1, cpu_llc_shared_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_llc_shared_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).llc_shared_map);
}

@@ -367,7 +365,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
}

- cpumask_set_cpu(cpu, cpu_llc_shared_mask(cpu));
+ cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
@@ -378,8 +376,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
for_each_cpu(i, cpu_sibling_setup_mask) {
if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
- cpumask_set_cpu(i, cpu_llc_shared_mask(cpu));
- cpumask_set_cpu(cpu, cpu_llc_shared_mask(i));
+ cpumask_set_cpu(i, &c->llc_shared_map);
+ cpumask_set_cpu(cpu, &cpu_data(i).llc_shared_map);
}
if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
cpumask_set_cpu(i, cpu_core_mask(cpu));
@@ -418,7 +416,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
return cpu_core_mask(cpu);
else
- return cpu_llc_shared_mask(cpu);
+ return &c->llc_shared_map;
}

static void impress_friends(void)
@@ -1052,7 +1050,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 501d4e0..b9f7a86 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -225,7 +225,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

--
1.7.9.1

2012-02-23 23:58:18

by Kevin Winchester

Subject: [PATCH v4 3/5] x86: Move per cpu cpu_sibling_map to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.

Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/perf_event_p4.h | 2 +-
arch/x86/include/asm/processor.h | 2 ++
arch/x86/include/asm/smp.h | 6 ------
arch/x86/include/asm/topology.h | 2 +-
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/x86/kernel/smpboot.c | 27 +++++++++++----------------
arch/x86/oprofile/op_model_p4.c | 5 +----
arch/x86/xen/smp.c | 1 -
drivers/cpufreq/p4-clockmod.c | 4 +---
drivers/cpufreq/speedstep-ich.c | 6 +++---
drivers/hwmon/coretemp.c | 6 +-----
11 files changed, 23 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 4f7e67e..29a65c2 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -189,7 +189,7 @@ static inline int p4_ht_thread(int cpu)
{
#ifdef CONFIG_SMP
if (smp_num_siblings == 2)
- return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
#endif
return 0;
}
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2d304f9..a3fce4e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -113,6 +113,8 @@ struct cpuinfo_x86 {
/* CPUs sharing the last level cache: */
cpumask_t llc_shared_map;
u16 llc_id;
+ /* representing HT siblings of each logical CPU */
+ cpumask_t sibling_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 40d1c96..b5e7cd2 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -31,15 +31,9 @@ static inline bool cpu_has_ht_siblings(void)
return has_siblings;
}

-DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU(int, cpu_number);

-static inline struct cpumask *cpu_sibling_mask(int cpu)
-{
- return per_cpu(cpu_sibling_map, cpu);
-}
-
static inline struct cpumask *cpu_core_mask(int cpu)
{
return per_cpu(cpu_core_map, cpu);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index b9676ae..5297acbf 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -161,7 +161,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
-#define topology_thread_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
+#define topology_thread_cpumask(cpu) (&cpu_data(cpu).sibling_map)

/* indicates that pointers to the topology cpumask_t maps are valid */
#define arch_provides_topology_pointers yes
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 5ddd6ef..7787d33 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -739,11 +739,11 @@ static int __cpuinit cache_shared_amd_cpu_map_setup(unsigned int cpu, int index)
}
} else if ((c->x86 == 0x15) && ((index == 1) || (index == 2))) {
ret = 1;
- for_each_cpu(i, cpu_sibling_mask(cpu)) {
+ for_each_cpu(i, &c->sibling_map) {
if (!per_cpu(ici_cpuid4_info, i))
continue;
this_leaf = CPUID4_INFO_IDX(i, index);
- for_each_cpu(sibling, cpu_sibling_mask(cpu)) {
+ for_each_cpu(sibling, &c->sibling_map) {
if (!cpu_online(sibling))
continue;
set_bit(sibling, this_leaf->shared_cpu_map);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3210646..7e73ea7 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,10 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* representing HT siblings of each logical CPU */
-DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
-EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
-
/* representing HT and core siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);
@@ -328,8 +324,8 @@ void __cpuinit smp_store_cpu_info(int id)

static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
{
- cpumask_set_cpu(cpu1, cpu_sibling_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_sibling_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).sibling_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).sibling_map);
cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
@@ -359,13 +355,13 @@ void __cpuinit set_cpu_sibling_map(int cpu)
}
}
} else {
- cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
+ cpumask_set_cpu(cpu, &c->sibling_map);
}

cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
- cpumask_copy(cpu_core_mask(cpu), cpu_sibling_mask(cpu));
+ cpumask_copy(cpu_core_mask(cpu), &c->sibling_map);
c->booted_cores = 1;
return;
}
@@ -383,12 +379,12 @@ void __cpuinit set_cpu_sibling_map(int cpu)
/*
* Does this new cpu bringup a new core?
*/
- if (cpumask_weight(cpu_sibling_mask(cpu)) == 1) {
+ if (cpumask_weight(&c->sibling_map) == 1) {
/*
* for each core in package, increment
* the booted_cores for this new cpu
*/
- if (cpumask_first(cpu_sibling_mask(i)) == i)
+ if (cpumask_first(&o->sibling_map) == i)
c->booted_cores++;
/*
* increment the core count for all
@@ -908,7 +904,7 @@ static __init void disable_smp(void)
physid_set_mask_of_physid(boot_cpu_physical_apicid, &phys_cpu_present_map);
else
physid_set_mask_of_physid(0, &phys_cpu_present_map);
- cpumask_set_cpu(0, cpu_sibling_mask(0));
+ cpumask_set_cpu(0, &cpu_data(0).sibling_map);
cpumask_set_cpu(0, cpu_core_mask(0));
}

@@ -1046,7 +1042,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)

current_thread_info()->cpu = 0; /* needed? */
for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);
@@ -1241,13 +1236,13 @@ static void remove_siblinginfo(int cpu)
/*/
* last thread sibling in this cpu core going down
*/
- if (cpumask_weight(cpu_sibling_mask(cpu)) == 1)
+ if (cpumask_weight(&c->sibling_map) == 1)
cpu_data(sibling).booted_cores--;
}

- for_each_cpu(sibling, cpu_sibling_mask(cpu))
- cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
- cpumask_clear(cpu_sibling_mask(cpu));
+ for_each_cpu(sibling, &c->sibling_map)
+ cpumask_clear_cpu(cpu, &cpu_data(sibling).sibling_map);
+ cpumask_clear(&c->sibling_map);
cpumask_clear(cpu_core_mask(cpu));
c->phys_proc_id = 0;
c->cpu_core_id = 0;
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index 98ab130..ae3503e 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -370,11 +370,8 @@ static struct p4_event_binding p4_events[NUM_EVENTS] = {
or "odd" part of all the divided resources. */
static unsigned int get_stagger(void)
{
-#ifdef CONFIG_SMP
int cpu = smp_processor_id();
- return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
-#endif
- return 0;
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
}

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index b9f7a86..00f32c0 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -223,7 +223,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
cpu_data(0).x86_max_cores = 1;

for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);
diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c
index 6be3e07..a14b9b0 100644
--- a/drivers/cpufreq/p4-clockmod.c
+++ b/drivers/cpufreq/p4-clockmod.c
@@ -203,9 +203,7 @@ static int cpufreq_p4_cpu_init(struct cpufreq_policy *policy)
int cpuid = 0;
unsigned int i;

-#ifdef CONFIG_SMP
- cpumask_copy(policy->cpus, cpu_sibling_mask(policy->cpu));
-#endif
+ cpumask_copy(policy->cpus, &c->sibling_map);

/* Errata workaround */
cpuid = (c->x86 << 8) | (c->x86_model << 4) | c->x86_mask;
diff --git a/drivers/cpufreq/speedstep-ich.c b/drivers/cpufreq/speedstep-ich.c
index a748ce7..630926a 100644
--- a/drivers/cpufreq/speedstep-ich.c
+++ b/drivers/cpufreq/speedstep-ich.c
@@ -326,14 +326,14 @@ static void get_freqs_on_cpu(void *_get_freqs)

static int speedstep_cpu_init(struct cpufreq_policy *policy)
{
+ struct cpuinfo_x86 *c = &cpu_data(policy->cpu);
int result;
unsigned int policy_cpu, speed;
struct get_freqs gf;

/* only run on CPU to be set, or on its sibling */
-#ifdef CONFIG_SMP
- cpumask_copy(policy->cpus, cpu_sibling_mask(policy->cpu));
-#endif
+ cpumask_copy(policy->cpus, &c->sibling_map);
+
policy_cpu = cpumask_any_and(policy->cpus, cpu_online_mask);

/* detect low and high frequency and transition latency */
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index a6c6ec3..fdf1590 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -61,11 +61,7 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
#define TO_CORE_ID(cpu) cpu_data(cpu).cpu_core_id
#define TO_ATTR_NO(cpu) (TO_CORE_ID(cpu) + BASE_SYSFS_ATTR_NO)

-#ifdef CONFIG_SMP
-#define for_each_sibling(i, cpu) for_each_cpu(i, cpu_sibling_mask(cpu))
-#else
-#define for_each_sibling(i, cpu) for (i = 0; false; )
-#endif
+#define for_each_sibling(i, cpu) for_each_cpu(i, &cpu_data(cpu).sibling_map)

/*
* Per-Core Temperature Data
--
1.7.9.1

2012-02-23 23:58:22

by Kevin Winchester

Subject: [PATCH v4 5/5] x86: Remove #ifdef CONFIG_SMP sections by moving smp_num_siblings into common.c

smp_num_siblings was defined in arch/x86/kernel/smpboot.c, making it
necessary to wrap any UP relevant code referencing it with #ifdef
CONFIG_SMP.

Instead, move the definition to arch/x86/kernel/cpu/common.c, thus
making it available always.

Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/perf_event_p4.h | 14 +++-----------
arch/x86/include/asm/smp.h | 6 +-----
arch/x86/include/asm/topology.h | 4 +---
arch/x86/kernel/cpu/amd.c | 4 ----
arch/x86/kernel/cpu/common.c | 6 ++++--
arch/x86/kernel/cpu/perf_event_p4.c | 4 ++--
arch/x86/kernel/cpu/proc.c | 5 ++---
arch/x86/kernel/cpu/topology.c | 2 --
arch/x86/kernel/process.c | 3 +--
arch/x86/kernel/smpboot.c | 4 ----
arch/x86/oprofile/nmi_int.c | 6 ------
arch/x86/oprofile/op_model_p4.c | 6 ------
12 files changed, 14 insertions(+), 50 deletions(-)

diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 29a65c2..cfe41dc 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -8,6 +8,8 @@
#include <linux/cpu.h>
#include <linux/bitops.h>

+#include <asm/smp.h>
+
/*
* NetBurst has performance MSRs shared between
* threads if HT is turned on, ie for both logical
@@ -177,20 +179,10 @@ static inline u64 p4_clear_ht_bit(u64 config)
return config & ~P4_CONFIG_HT;
}

-static inline int p4_ht_active(void)
-{
-#ifdef CONFIG_SMP
- return smp_num_siblings > 1;
-#endif
- return 0;
-}
-
static inline int p4_ht_thread(int cpu)
{
-#ifdef CONFIG_SMP
if (smp_num_siblings == 2)
- return cpu != cpumask_first(&cpu_data(cpu).sibling_map));
-#endif
+ return cpu != cpumask_first(&cpu_data(cpu).sibling_map);
return 0;
}

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 75aea4d..787127e 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -24,11 +24,7 @@ extern unsigned int num_processors;

static inline bool cpu_has_ht_siblings(void)
{
- bool has_siblings = false;
-#ifdef CONFIG_SMP
- has_siblings = cpu_has_ht && smp_num_siblings > 1;
-#endif
- return has_siblings;
+ return cpu_has_ht && smp_num_siblings > 1;
}

DECLARE_PER_CPU(int, cpu_number);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 58438a1b..7250ad1 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -174,11 +174,9 @@ static inline void arch_fix_phys_package_id(int num, u32 slot)
struct pci_bus;
void x86_pci_root_bus_resources(int bus, struct list_head *resources);

-#ifdef CONFIG_SMP
#define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
(cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
-#define smt_capable() (smp_num_siblings > 1)
-#endif
+#define smt_capable() (smp_num_siblings > 1)

#ifdef CONFIG_NUMA
extern int get_mp_bus_to_node(int busnum);
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 1cd9d51..a8b46df 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -263,7 +263,6 @@ static int __cpuinit nearby_node(int apicid)
* Assumption: Number of cores in each internal node is the same.
* (2) AMD processors supporting compute units
*/
-#ifdef CONFIG_X86_HT
static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
{
u32 nodes, cores_per_cu = 1;
@@ -307,7 +306,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
c->compute_unit_id %= cus_per_node;
}
}
-#endif

/*
* On a AMD dual core setup the lower bits of the APIC id distingush the cores.
@@ -315,7 +313,6 @@ static void __cpuinit amd_get_topology(struct cpuinfo_x86 *c)
*/
static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_X86_HT
unsigned bits;

bits = c->x86_coreid_bits;
@@ -326,7 +323,6 @@ static void __cpuinit amd_detect_cmp(struct cpuinfo_x86 *c)
/* use socket ID also for last level cache */
c->llc_id = c->phys_proc_id;
amd_get_topology(c);
-#endif
}

int amd_get_nb_id(int cpu)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ad2a148..8343f54 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -48,6 +48,10 @@ cpumask_var_t cpu_initialized_mask;
cpumask_var_t cpu_callout_mask;
cpumask_var_t cpu_callin_mask;

+/* Number of siblings per CPU package */
+int smp_num_siblings = 1;
+EXPORT_SYMBOL(smp_num_siblings);
+
/* representing cpus for which sibling maps can be computed */
cpumask_var_t cpu_sibling_setup_mask;

@@ -453,7 +457,6 @@ void __cpuinit cpu_detect_cache_sizes(struct cpuinfo_x86 *c)

void __cpuinit detect_ht(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_X86_HT
u32 eax, ebx, ecx, edx;
int index_msb, core_bits;
static bool printed;
@@ -499,7 +502,6 @@ out:
c->cpu_core_id);
printed = 1;
}
-#endif
}

static void __cpuinit get_cpu_vendor(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/perf_event_p4.c b/arch/x86/kernel/cpu/perf_event_p4.c
index ef484d9..9d1413d 100644
--- a/arch/x86/kernel/cpu/perf_event_p4.c
+++ b/arch/x86/kernel/cpu/perf_event_p4.c
@@ -775,7 +775,7 @@ static int p4_validate_raw_event(struct perf_event *event)
* if an event is shared across the logical threads
* the user needs special permissions to be able to use it
*/
- if (p4_ht_active() && p4_event_bind_map[v].shared) {
+ if (smt_capable() && p4_event_bind_map[v].shared) {
if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
return -EACCES;
}
@@ -816,7 +816,7 @@ static int p4_hw_config(struct perf_event *event)
event->hw.config = p4_config_pack_escr(escr) |
p4_config_pack_cccr(cccr);

- if (p4_ht_active() && p4_ht_thread(cpu))
+ if (smt_capable() && p4_ht_thread(cpu))
event->hw.config = p4_set_ht_bit(event->hw.config);

if (event->attr.type == PERF_TYPE_RAW) {
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index e6e07c2..aef8b27 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -1,16 +1,16 @@
-#include <linux/smp.h>
#include <linux/timex.h>
#include <linux/string.h>
#include <linux/seq_file.h>
#include <linux/cpufreq.h>

+#include <asm/smp.h>
+
/*
* Get CPU information for use by the procfs.
*/
static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
unsigned int cpu)
{
-#ifdef CONFIG_SMP
if (c->x86_max_cores * smp_num_siblings > 1) {
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
seq_printf(m, "siblings\t: %d\n", cpumask_weight(&c->core_map));
@@ -19,7 +19,6 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
seq_printf(m, "apicid\t\t: %d\n", c->apicid);
seq_printf(m, "initial apicid\t: %d\n", c->initial_apicid);
}
-#endif
}

#ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 4397e98..d4ee471 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -28,7 +28,6 @@
*/
void __cpuinit detect_extended_topology(struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_SMP
unsigned int eax, ebx, ecx, edx, sub_index;
unsigned int ht_mask_width, core_plus_mask_width;
unsigned int core_select_mask, core_level_siblings;
@@ -95,5 +94,4 @@ void __cpuinit detect_extended_topology(struct cpuinfo_x86 *c)
printed = 1;
}
return;
-#endif
}
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 14baf78..c992254 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -587,12 +587,11 @@ static void amd_e400_idle(void)

void __cpuinit select_idle_routine(const struct cpuinfo_x86 *c)
{
-#ifdef CONFIG_SMP
if (pm_idle == poll_idle && smp_num_siblings > 1) {
printk_once(KERN_WARNING "WARNING: polling idle and HT enabled,"
" performance may degrade.\n");
}
-#endif
+
if (pm_idle)
return;

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3a4908d..4c5a5e5 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -112,10 +112,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
#define set_idle_for_cpu(x, p) (idle_thread_array[(x)] = (p))
#endif

-/* Number of siblings per CPU package */
-int smp_num_siblings = 1;
-EXPORT_SYMBOL(smp_num_siblings);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 26b8a85..346e7ac 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -572,11 +572,6 @@ static int __init p4_init(char **cpu_type)
if (cpu_model > 6 || cpu_model == 5)
return 0;

-#ifndef CONFIG_SMP
- *cpu_type = "i386/p4";
- model = &op_p4_spec;
- return 1;
-#else
switch (smp_num_siblings) {
case 1:
*cpu_type = "i386/p4";
@@ -588,7 +583,6 @@ static int __init p4_init(char **cpu_type)
model = &op_p4_ht2_spec;
return 1;
}
-#endif

printk(KERN_INFO "oprofile: P4 HyperThreading detected with > 2 threads\n");
printk(KERN_INFO "oprofile: Reverting to timer mode.\n");
diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
index ae3503e..c6bcb22 100644
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -42,21 +42,15 @@ static unsigned int num_controls = NUM_CONTROLS_NON_HT;
kernel boot-time. */
static inline void setup_num_counters(void)
{
-#ifdef CONFIG_SMP
if (smp_num_siblings == 2) {
num_counters = NUM_COUNTERS_HT2;
num_controls = NUM_CONTROLS_HT2;
}
-#endif
}

static inline int addr_increment(void)
{
-#ifdef CONFIG_SMP
return smp_num_siblings == 2 ? 2 : 1;
-#else
- return 1;
-#endif
}

--
1.7.9.1

2012-02-23 23:59:10

by Kevin Winchester

Subject: [PATCH v4 4/5] x86: Move per cpu cpu_core_map to a field in struct cpuinfo_x86

This simplifies the various code paths using this field as it
groups the per-cpu data together.

Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Kevin Winchester <[email protected]>
---
arch/x86/include/asm/processor.h | 5 +++++
arch/x86/include/asm/smp.h | 6 ------
arch/x86/include/asm/topology.h | 4 ++--
arch/x86/kernel/cpu/proc.c | 3 +--
arch/x86/kernel/smpboot.c | 35 ++++++++++++++---------------------
arch/x86/kernel/tsc_sync.c | 2 +-
arch/x86/xen/smp.c | 4 ----
drivers/cpufreq/acpi-cpufreq.c | 2 +-
drivers/cpufreq/powernow-k8.c | 13 +++----------
9 files changed, 27 insertions(+), 47 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a3fce4e..35ab05b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -115,6 +115,11 @@ struct cpuinfo_x86 {
u16 llc_id;
/* representing HT siblings of each logical CPU */
cpumask_t sibling_map;
+ /*
+ * representing all execution threads on a logical CPU, i.e. per
+ * physical socket
+ */
+ cpumask_t core_map;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index b5e7cd2..75aea4d 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -31,14 +31,8 @@ static inline bool cpu_has_ht_siblings(void)
return has_siblings;
}

-DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU(int, cpu_number);

-static inline struct cpumask *cpu_core_mask(int cpu)
-{
- return per_cpu(cpu_core_map, cpu);
-}
-
DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32)
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 5297acbf..58438a1b 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -160,7 +160,7 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#ifdef ENABLE_TOPO_DEFINES
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
-#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
+#define topology_core_cpumask(cpu) (&cpu_data(cpu).core_map)
#define topology_thread_cpumask(cpu) (&cpu_data(cpu).sibling_map)

/* indicates that pointers to the topology cpumask_t maps are valid */
@@ -176,7 +176,7 @@ void x86_pci_root_bus_resources(int bus, struct list_head *resources);

#ifdef CONFIG_SMP
#define mc_capable() ((boot_cpu_data.x86_max_cores > 1) && \
- (cpumask_weight(cpu_core_mask(0)) != nr_cpu_ids))
+ (cpumask_weight(&boot_cpu_data.core_map) != nr_cpu_ids))
#define smt_capable() (smp_num_siblings > 1)
#endif

diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 8022c66..e6e07c2 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -13,8 +13,7 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinfo_x86 *c,
#ifdef CONFIG_SMP
if (c->x86_max_cores * smp_num_siblings > 1) {
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
- seq_printf(m, "siblings\t: %d\n",
- cpumask_weight(cpu_core_mask(cpu)));
+ seq_printf(m, "siblings\t: %d\n", cpumask_weight(&c->core_map));
seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
seq_printf(m, "apicid\t\t: %d\n", c->apicid);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 7e73ea7..3a4908d 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -116,10 +116,6 @@ static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);

-/* representing HT and core siblings of each logical CPU */
-DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
-EXPORT_PER_CPU_SYMBOL(cpu_core_map);
-
/* Per CPU bogomips and other parameters */
DEFINE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -326,8 +322,8 @@ static void __cpuinit link_thread_siblings(int cpu1, int cpu2)
{
cpumask_set_cpu(cpu1, &cpu_data(cpu2).sibling_map);
cpumask_set_cpu(cpu2, &cpu_data(cpu1).sibling_map);
- cpumask_set_cpu(cpu1, cpu_core_mask(cpu2));
- cpumask_set_cpu(cpu2, cpu_core_mask(cpu1));
+ cpumask_set_cpu(cpu1, &cpu_data(cpu2).core_map);
+ cpumask_set_cpu(cpu2, &cpu_data(cpu1).core_map);
cpumask_set_cpu(cpu1, &cpu_data(cpu2).llc_shared_map);
cpumask_set_cpu(cpu2, &cpu_data(cpu1).llc_shared_map);
}
@@ -361,7 +357,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, &c->llc_shared_map);

if (__this_cpu_read(cpu_info.x86_max_cores) == 1) {
- cpumask_copy(cpu_core_mask(cpu), &c->sibling_map);
+ cpumask_copy(&c->core_map, &c->sibling_map);
c->booted_cores = 1;
return;
}
@@ -374,8 +370,8 @@ void __cpuinit set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, &o->llc_shared_map);
}
if (c->phys_proc_id == o->phys_proc_id) {
- cpumask_set_cpu(i, cpu_core_mask(cpu));
- cpumask_set_cpu(cpu, cpu_core_mask(i));
+ cpumask_set_cpu(i, &c->core_map);
+ cpumask_set_cpu(cpu, &o->core_map);
/*
* Does this new cpu bringup a new core?
*/
@@ -404,11 +400,11 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
struct cpuinfo_x86 *c = &cpu_data(cpu);
/*
* For perf, we return last level cache shared map.
- * And for power savings, we return cpu_core_map
+ * And for power savings, we return core map.
*/
if ((sched_mc_power_savings || sched_smt_power_savings) &&
!(cpu_has(c, X86_FEATURE_AMD_DCM)))
- return cpu_core_mask(cpu);
+ return &c->core_map;
else
return &c->llc_shared_map;
}
@@ -905,7 +901,7 @@ static __init void disable_smp(void)
else
physid_set_mask_of_physid(0, &phys_cpu_present_map);
cpumask_set_cpu(0, &cpu_data(0).sibling_map);
- cpumask_set_cpu(0, cpu_core_mask(0));
+ cpumask_set_cpu(0, &cpu_data(0).core_map);
}

/*
@@ -1028,8 +1024,6 @@ static void __init smp_cpu_index_default(void)
*/
void __init native_smp_prepare_cpus(unsigned int max_cpus)
{
- unsigned int i;
-
preempt_disable();
smp_cpu_index_default();

@@ -1041,9 +1035,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
mb();

current_thread_info()->cpu = 0; /* needed? */
- for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- }
set_cpu_sibling_map(0);

@@ -1231,19 +1222,21 @@ static void remove_siblinginfo(int cpu)
int sibling;
struct cpuinfo_x86 *c = &cpu_data(cpu);

- for_each_cpu(sibling, cpu_core_mask(cpu)) {
- cpumask_clear_cpu(cpu, cpu_core_mask(sibling));
+ for_each_cpu(sibling, &c->core_map) {
+ struct cpuinfo_x86 *o = &cpu_data(sibling);
+
+ cpumask_clear_cpu(cpu, &o->core_map);
/*/
* last thread sibling in this cpu core going down
*/
if (cpumask_weight(&c->sibling_map) == 1)
- cpu_data(sibling).booted_cores--;
+ o->booted_cores--;
}

for_each_cpu(sibling, &c->sibling_map)
cpumask_clear_cpu(cpu, &c->sibling_map);
cpumask_clear(&c->sibling_map);
- cpumask_clear(cpu_core_mask(cpu));
+ cpumask_clear(&c->core_map);
c->phys_proc_id = 0;
c->cpu_core_id = 0;
cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index fc25e60..d16c5e3 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -114,7 +114,7 @@ static __cpuinit void check_tsc_warp(unsigned int timeout)
*/
static inline unsigned int loop_timeout(int cpu)
{
- return (cpumask_weight(cpu_core_mask(cpu)) > 1) ? 2 : 20;
+ return (cpumask_weight(&cpu_data(cpu).core_map) > 1) ? 2 : 20;
}

/*
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 00f32c0..d1792ec 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -206,7 +206,6 @@ static void __init xen_smp_prepare_boot_cpu(void)
static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
{
unsigned cpu;
- unsigned int i;

if (skip_ioapic_setup) {
char *m = (max_cpus == 0) ?
@@ -222,9 +221,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
smp_store_cpu_info(0);
cpu_data(0).x86_max_cores = 1;

- for_each_possible_cpu(i) {
- zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
- }
set_cpu_sibling_map(0);

if (xen_smp_intr_init(0))
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 56c6c6b..152af7f 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -557,7 +557,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
dmi_check_system(sw_any_bug_dmi_table);
if (bios_with_sw_any_bug && cpumask_weight(policy->cpus) == 1) {
policy->shared_type = CPUFREQ_SHARED_TYPE_ALL;
- cpumask_copy(policy->cpus, cpu_core_mask(cpu));
+ cpumask_copy(policy->cpus, &c->core_map);
}
#endif

diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
index 8f9b2ce..da0767c 100644
--- a/drivers/cpufreq/powernow-k8.c
+++ b/drivers/cpufreq/powernow-k8.c
@@ -66,13 +66,6 @@ static struct msr __percpu *msrs;

static struct cpufreq_driver cpufreq_amd64_driver;

-#ifndef CONFIG_SMP
-static inline const struct cpumask *cpu_core_mask(int cpu)
-{
- return cpumask_of(0);
-}
-#endif
-
/* Return a frequency in MHz, given an input fid */
static u32 find_freq_from_fid(u32 fid)
{
@@ -715,7 +708,7 @@ static int fill_powernow_table(struct powernow_k8_data *data,

pr_debug("cfid 0x%x, cvid 0x%x\n", data->currfid, data->currvid);
data->powernow_table = powernow_table;
- if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu)
+ if (cpumask_first(&cpu_data(data->cpu).core_map) == data->cpu)
print_basics(data);

for (j = 0; j < data->numps; j++)
@@ -884,7 +877,7 @@ static int powernow_k8_cpu_init_acpi(struct powernow_k8_data *data)
powernow_table[data->acpi_data.state_count].index = 0;
data->powernow_table = powernow_table;

- if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu)
+ if (cpumask_first(&cpu_data(data->cpu).core_map) == data->cpu)
print_basics(data);

/* notify BIOS that we exist */
@@ -1326,7 +1319,7 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
if (cpu_family == CPU_HW_PSTATE)
cpumask_copy(pol->cpus, cpumask_of(pol->cpu));
else
- cpumask_copy(pol->cpus, cpu_core_mask(pol->cpu));
+ cpumask_copy(pol->cpus, &c->core_map);
data->available_cores = pol->cpus;

if (cpu_family == CPU_HW_PSTATE)
--
1.7.9.1
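
A note on the shape of this change, as a rough userspace sketch rather than kernel code: the patch replaces a separately allocated per-cpu cpumask_var_t with a cpumask_t embedded by value in struct cpuinfo_x86, so the zalloc_cpumask_var() loop and the !SMP stub are no longer needed. Everything below (struct cpuinfo, mask_set(), set_core_map(), remove_core_map(), the 4-CPU limit) is a hypothetical stand-in chosen for illustration, and the one-word bitmask is an assumption that only works for small CPU counts.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical, kept small for illustration */

/* Stand-in for cpumask_t: one bit per CPU. */
typedef struct { uint64_t bits; } cpumask_t;

/* Stand-in for struct cpuinfo_x86, with the mask embedded by value. */
struct cpuinfo {
	int phys_proc_id;
	cpumask_t core_map;
};

/* Static storage is zeroed, so no per-CPU allocation step is needed. */
static struct cpuinfo cpu_info[NR_CPUS];
static cpumask_t setup_mask;	/* CPUs that have been brought up */

static void mask_set(int cpu, cpumask_t *m)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->bits |= (uint64_t)1 << cpu;
}

static void mask_clear(int cpu, cpumask_t *m)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->bits &= ~((uint64_t)1 << cpu);
}

static int mask_test(int cpu, const cpumask_t *m)
{
	return (int)((m->bits >> cpu) & 1);
}

static int mask_weight(const cpumask_t *m)
{
	int n = 0;

	/* Clear the lowest set bit until none remain. */
	for (uint64_t b = m->bits; b; b &= b - 1)
		n++;
	return n;
}

/* Link a newly set-up CPU with every set-up CPU in the same package. */
static void set_core_map(int cpu)
{
	struct cpuinfo *c = &cpu_info[cpu];

	mask_set(cpu, &setup_mask);
	for (int i = 0; i < NR_CPUS; i++) {
		struct cpuinfo *o = &cpu_info[i];

		if (!mask_test(i, &setup_mask))
			continue;
		if (c->phys_proc_id == o->phys_proc_id) {
			mask_set(i, &c->core_map);
			mask_set(cpu, &o->core_map);
		}
	}
}

/* Undo the linking when a CPU goes away. */
static void remove_core_map(int cpu)
{
	struct cpuinfo *c = &cpu_info[cpu];

	for (int s = 0; s < NR_CPUS; s++) {
		if (mask_test(s, &c->core_map))
			mask_clear(cpu, &cpu_info[s].core_map);
	}
	c->core_map.bits = 0;
	mask_clear(cpu, &setup_mask);
}
```

With two packages of two CPUs each, calling set_core_map() for CPUs 0..3 leaves every core_map with weight 2, which corresponds to the value the proc.c hunk prints as "siblings". The trade-off is the one the cover letter's size table shows: the mask storage is always present in the structure, in exchange for dropping the allocation and the accessor.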

2012-02-23 23:59:42

by Kevin Winchester

Subject: [PATCH v4 2/5] x86: Move per cpu cpu_llc_id to a field in struct cpuinfo_x86

2012-02-24 11:47:46

by Borislav Petkov

Subject: Re: [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

On Thu, Feb 23, 2012 at 07:57:51PM -0400, Kevin Winchester wrote:
> Various per-cpu fields are defined in arch/x86/kernel/smpboot.c that are
> basically equivalent to the cpu-specific data in struct cpuinfo_x86.
> By moving these fields into the structure, a number of codepaths can be
> simplified since they no longer need to care about those fields not
> existing on !SMP builds.
>
> The size effects on allno (UP) and allyes (MAX_SMP) kernels are as
> follows:
>
>     text     data      bss       dec     hex filename
>  1586721   304864   506208   2397793  249661 vmlinux.allno
>  1588517   304928   505920   2399365  249c85 vmlinux.allno.after
> 84706053 13212311 42434560 140352924 85d9d9c vmlinux.allyes
> 84705333 13213799 42434560 140353692 85da09c vmlinux.allyes.afte
>
> As can be seen, the kernels get slightly larger, but the code reduction/
> simplification should be enough to compensate for it.

Just a hint for the future: when you're sending multiple versions of
a patchset, it would be really helpful to have changelog in the 0/n
message so that the reviewer can know what happened in each version.
I.e.,

v4:
Rediff changes against -rc4

v3:
Small cleanups, integrate comments.

etc.

Otherwise, we have to go look at the older patches and compare what
changed.

HTH.

--
Regards/Gruss,
Boris.

2012-02-24 12:22:09

by Kevin Winchester

Subject: Re: [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

On 24 February 2012 07:47, Borislav Petkov <[email protected]> wrote:
>
> Just a hint for the future: when you're sending multiple versions of
> a patchset, it would be really helpful to have changelog in the 0/n
> message so that the reviewer can know what happened in each version.
> I.e.,
>
> v4:
>         Rediff changes against -rc4
>
> v3:
>         Small cleanups, integrate comments.
>
> etc.
>
> Otherwise, we have to go look at the older patches and compare what
> changed.
>

Yes, of course, I'm sorry. I've been reading LKML for years...You
would think that when I finally get the chance to contribute I would
have learned something by this time.

--
Kevin Winchester

2012-02-24 12:30:48

by Borislav Petkov

Subject: Re: [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

On Fri, Feb 24, 2012 at 08:22:05AM -0400, Kevin Winchester wrote:
> Yes, of course, I'm sorry. I've been reading LKML for years...You
> would think that when I finally get the chance to contribute I would
> have learned something by this time.

Nah, no worries, you'll get the hang of it with time. Keep up the good
work! :-)

Thanks.

--
Regards/Gruss,
Boris.

2012-02-27 11:59:32

by Ingo Molnar

Subject: Re: [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

* Kevin Winchester <[email protected]> wrote:

> Various per-cpu fields are defined in arch/x86/kernel/smpboot.c
> that are basically equivalent to the cpu-specific data in
> struct cpuinfo_x86. By moving these fields into the structure,
> a number of codepaths can be simplified since they no longer
> need to care about those fields not existing on !SMP builds.

Works mostly fine, except with the attached 32-bit UP !APIC
config I get various build failures (resolved via the patch
below) and a link failure (not resolved):

make[1]: Nothing to be done for `all'.
arch/x86/built-in.o:vdso32-setup.c:function detect_extended_topology: error: undefined reference to 'apic'
arch/x86/built-in.o:vdso32-setup.c:function detect_extended_topology: error: undefined reference to 'apic'
arch/x86/built-in.o:vdso32-setup.c:function detect_extended_topology: error: undefined reference to 'apic'
arch/x86/built-in.o:vdso32-setup.c:function detect_ht: error: undefined reference to 'apic'
arch/x86/built-in.o:vdso32-setup.c:function x86_msi: error: undefined reference to 'native_setup_msi_irqs'
arch/x86/built-in.o:vdso32-setup.c:function x86_msi: error: undefined reference to 'native_teardown_msi_irq'
make: *** [.tmp_vmlinux1] Error 1

Thanks,

Ingo

------------>

arch/x86/kernel/cpu/amd.c | 1 +
arch/x86/kernel/cpu/topology.c | 1 +
arch/x86/kernel/process.c | 1 +
3 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index c593eac..84bf176 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -8,6 +8,7 @@
#include <linux/sched.h>
#include <asm/processor.h>
#include <asm/apic.h>
+#include <asm/smp.h>
#include <asm/cpu.h>
#include <asm/pci-direct.h>

diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 4397e98..c53440c 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -6,6 +6,7 @@

#include <linux/cpu.h>
#include <asm/apic.h>
+#include <asm/smp.h>
#include <asm/pat.h>
#include <asm/processor.h>

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 14baf78..3dd6015 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -18,6 +18,7 @@
#include <asm/system.h>
#include <asm/apic.h>
#include <asm/syscalls.h>
+#include <asm/smp.h>
#include <asm/idle.h>
#include <asm/uaccess.h>
#include <asm/i387.h>

Attachments:

(No filename) (2.39 kB)
config (72.49 kB)

2012-02-28 00:52:58

by Kevin Winchester

Subject: Re: [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

On 27 February 2012 07:59, Ingo Molnar <[email protected]> wrote:
>
> * Kevin Winchester <[email protected]> wrote:
>
>> Various per-cpu fields are defined in arch/x86/kernel/smpboot.c
>> that are basically equivalent to the cpu-specific data in
>> struct cpuinfo_x86. By moving these fields into the structure,
>> a number of codepaths can be simplified since they no longer
>> need to care about those fields not existing on !SMP builds.
>
> Works mostly fine, except with the attached 32-bit UP !APIC
> config I get various build failures (resolved via the patch
> below) and a link failure (not resolved):
>

I get the following failure before I get to link time:

In file included from
/home/kevin/linux/linux-2.6/arch/x86/include/asm/uaccess.h:573:0,
from
/home/kevin/linux/linux-2.6/arch/x86/include/asm/sections.h:5,
from
/home/kevin/linux/linux-2.6/arch/x86/include/asm/hw_irq.h:26,
from include/linux/irq.h:357,
from
/home/kevin/linux/linux-2.6/arch/x86/include/asm/hardirq.h:5,
from include/linux/hardirq.h:7,
from include/linux/interrupt.h:12,
from net/core/pktgen.c:135:
In function ‘copy_from_user’,
inlined from ‘pktgen_if_write’ at net/core/pktgen.c:877:20:
/home/kevin/linux/linux-2.6/arch/x86/include/asm/uaccess_32.h:211:26:
error: call to ‘copy_from_user_overflow’ declared with attribute
error: copy_from_user() buffer size is not provably correct
make[2]: *** [net/core/pktgen.o] Error 1

On:

gcc (GCC) 4.6.2 20120120 (prerelease)

Is that my fault, or something else?

Kevin

2012-02-28 03:44:00

by H. Peter Anvin

Subject: Re: [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

On 02/27/2012 04:52 PM, Kevin Winchester wrote:
> On 27 February 2012 07:59, Ingo Molnar<[email protected]> wrote:
>>
>> * Kevin Winchester<[email protected]> wrote:
>>
>>> Various per-cpu fields are defined in arch/x86/kernel/smpboot.c
>>> that are basically equivalent to the cpu-specific data in
>>> struct cpuinfo_x86. By moving these fields into the structure,
>>> a number of codepaths can be simplified since they no longer
>>> need to care about those fields not existing on !SMP builds.
>>
>> Works mostly fine, except with the attached 32-bit UP !APIC
>> config I get various build failures (resolved via the patch
>> below) and a link failure (not resolved):
>>
>
> I get the following failure before I get to link time:
>
> In file included from
> /home/kevin/linux/linux-2.6/arch/x86/include/asm/uaccess.h:573:0,
> from
> /home/kevin/linux/linux-2.6/arch/x86/include/asm/sections.h:5,
> from
> /home/kevin/linux/linux-2.6/arch/x86/include/asm/hw_irq.h:26,
> from include/linux/irq.h:357,
> from
> /home/kevin/linux/linux-2.6/arch/x86/include/asm/hardirq.h:5,
> from include/linux/hardirq.h:7,
> from include/linux/interrupt.h:12,
> from net/core/pktgen.c:135:
> In function ‘copy_from_user’,
> inlined from ‘pktgen_if_write’ at net/core/pktgen.c:877:20:
> /home/kevin/linux/linux-2.6/arch/x86/include/asm/uaccess_32.h:211:26:
> error: call to ‘copy_from_user_overflow’ declared with attribute
> error: copy_from_user() buffer size is not provably correct
> make[2]: *** [net/core/pktgen.o] Error 1
>
> On:
>
> gcc (GCC) 4.6.2 20120120 (prerelease)
>
> Is that my fault, or something else?
>
> Kevin
>

That comes from compiling with warnings as errors. Not that someone
shouldn't look at that kind of problem.

-hpa

2012-02-28 08:24:44

by Ingo Molnar

Subject: Re: [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

* H. Peter Anvin <[email protected]> wrote:

> >In function ‘copy_from_user’,
> > inlined from ‘pktgen_if_write’ at net/core/pktgen.c:877:20:
> >/home/kevin/linux/linux-2.6/arch/x86/include/asm/uaccess_32.h:211:26:
> >error: call to ‘copy_from_user_overflow’ declared with attribute
> >error: copy_from_user() buffer size is not provably correct
> >make[2]: *** [net/core/pktgen.o] Error 1
> >
> >On:
> >
> >gcc (GCC) 4.6.2 20120120 (prerelease)
> >
> >Is that my fault, or something else?
> >
> >Kevin
> >
>
> That comes from compiling with warnings as errors. Not that someone
> shouldn't look at that kind of problem.

Can probably be worked around by disabling:

CONFIG_DEBUG_STRICT_USER_COPY_CHECKS

Thanks,

Ingo

2012-02-28 08:32:05

by H. Peter Anvin

Subject: Re: [PATCH v4 0/5] x86: Cleanup and simplify cpu-specific data

Better yet, fix the problem...

Ingo Molnar <[email protected]> wrote:

>
>* H. Peter Anvin <[email protected]> wrote:
>
>> >In function ‘copy_from_user’,
>> > inlined from ‘pktgen_if_write’ at net/core/pktgen.c:877:20:
>>
>>/home/kevin/linux/linux-2.6/arch/x86/include/asm/uaccess_32.h:211:26:
>> >error: call to ‘copy_from_user_overflow’ declared with attribute
>> >error: copy_from_user() buffer size is not provably correct
>> >make[2]: *** [net/core/pktgen.o] Error 1
>> >
>> >On:
>> >
>> >gcc (GCC) 4.6.2 20120120 (prerelease)
>> >
>> >Is that my fault, or something else?
>> >
>> >Kevin
>> >
>>
>> That comes from compiling with warnings as errors. Not that someone
>> shouldn't look at that kind of problem.
>
>Can probably be worked around by disabling:
>
>CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
>
>Thanks,
>
> Ingo

--
Sent from my mobile phone. Please excuse my brevity and lack of formatting.
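
For context on what "fix the problem" usually means for this class of error: the strict user-copy check fires when the compiler cannot prove that the length passed to copy_from_user() fits the destination object. The thread does not show the code at pktgen.c:877, so the following is only a generic userspace sketch of the usual pattern, clamping the length against the buffer size before copying. BUF_SIZE, fake_copy_from_user() and bounded_copy() are hypothetical names, and memcpy() stands in for the real user-space copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 32	/* hypothetical destination size */

/*
 * Stand-in for copy_from_user(): returns the number of bytes that
 * could NOT be copied, so 0 means success.
 */
static size_t fake_copy_from_user(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/*
 * Copy at most BUF_SIZE - 1 bytes into dst and NUL-terminate it.
 * Because len is clamped against a compile-time constant, the bound
 * is visible at the call to the copy routine.
 * Returns the number of bytes copied.
 */
static size_t bounded_copy(char dst[BUF_SIZE], const char *src, size_t count)
{
	size_t len = count < BUF_SIZE - 1 ? count : BUF_SIZE - 1;

	assert(dst != NULL && src != NULL);
	if (fake_copy_from_user(dst, src, len) != 0)
		return 0;
	dst[len] = '\0';
	return len;
}
```

Whether the pktgen code needs exactly this shape is not something the thread establishes; disabling CONFIG_DEBUG_STRICT_USER_COPY_CHECKS, as suggested above, only hides the diagnostic.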