2015-11-10 11:48:57

by Borislav Petkov

Subject: [RFC PATCH 0/3] x86/cpufeature: Cleanup stuff

From: Borislav Petkov <[email protected]>

Hi all,

so this should take care of cleaning up some aspects of our cpufeatures
handling.

Patches should be pretty self-explanatory but let me send them out as
an RFC - I might've missed something obvious of the sort "but but, you
can't do that..."

Thanks.

Borislav Petkov (3):
x86/cpufeature: Move some of the scattered feature bits to
x86_capability
x86/cpufeature: Cleanup get_cpu_cap()
x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

arch/x86/crypto/chacha20_glue.c | 2 +-
arch/x86/crypto/crc32c-intel_glue.c | 3 +-
arch/x86/include/asm/cmpxchg_32.h | 2 +-
arch/x86/include/asm/cpufeature.h | 106 +++++++++++++++-------------
arch/x86/include/asm/smp.h | 2 +-
arch/x86/kernel/cpu/amd.c | 2 +-
arch/x86/kernel/cpu/centaur.c | 2 +-
arch/x86/kernel/cpu/common.c | 48 +++++++------
arch/x86/kernel/cpu/intel.c | 3 +-
arch/x86/kernel/cpu/mtrr/generic.c | 2 +-
arch/x86/kernel/cpu/mtrr/main.c | 2 +-
arch/x86/kernel/cpu/perf_event_amd.c | 4 +-
arch/x86/kernel/cpu/perf_event_amd_uncore.c | 8 +--
arch/x86/kernel/cpu/scattered.c | 20 ------
arch/x86/kernel/cpu/transmeta.c | 4 +-
arch/x86/kernel/fpu/init.c | 4 +-
arch/x86/kernel/hw_breakpoint.c | 3 +-
arch/x86/kernel/vm86_32.c | 4 +-
arch/x86/mm/setup_nx.c | 4 +-
drivers/char/hw_random/via-rng.c | 5 +-
drivers/crypto/padlock-aes.c | 2 +-
drivers/crypto/padlock-sha.c | 3 +-
fs/btrfs/disk-io.c | 2 +-
23 files changed, 115 insertions(+), 122 deletions(-)

--
2.3.5


2015-11-10 11:49:46

by Borislav Petkov

Subject: [RFC PATCH 1/3] x86/cpufeature: Move some of the scattered feature bits to x86_capability

From: Borislav Petkov <[email protected]>

Turn the CPUID leaves which are proper feature-bit leaves into separate
->x86_capability words.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/cpufeature.h | 54 +++++++++++++++++++++++----------------
arch/x86/kernel/cpu/common.c | 5 ++++
arch/x86/kernel/cpu/scattered.c | 20 ---------------
3 files changed, 37 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index e4f8010f22e0..13d78e0e6ae0 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -12,7 +12,7 @@
#include <asm/disabled-features.h>
#endif

-#define NCAPINTS 14 /* N 32-bit words worth of info */
+#define NCAPINTS 16 /* N 32-bit words worth of info */
#define NBUGINTS 1 /* N 32-bit bug flags */

/*
@@ -181,22 +181,17 @@

/*
* Auxiliary flags: Linux defined - For features scattered in various
- * CPUID levels like 0x6, 0xA etc, word 7
+ * CPUID levels like 0x6, 0xA etc, word 7.
+ *
+ * Reuse free bits when adding new feature flags!
*/
-#define X86_FEATURE_IDA ( 7*32+ 0) /* Intel Dynamic Acceleration */
-#define X86_FEATURE_ARAT ( 7*32+ 1) /* Always Running APIC Timer */
+
#define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */
#define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
-#define X86_FEATURE_PLN ( 7*32+ 5) /* Intel Power Limit Notification */
-#define X86_FEATURE_PTS ( 7*32+ 6) /* Intel Package Thermal Status */
-#define X86_FEATURE_DTHERM ( 7*32+ 7) /* Digital Thermal Sensor */
+
#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
-#define X86_FEATURE_HWP ( 7*32+ 10) /* "hwp" Intel HWP */
-#define X86_FEATURE_HWP_NOTIFY ( 7*32+ 11) /* Intel HWP_NOTIFY */
-#define X86_FEATURE_HWP_ACT_WINDOW ( 7*32+ 12) /* Intel HWP_ACT_WINDOW */
-#define X86_FEATURE_HWP_EPP ( 7*32+13) /* Intel HWP_EPP */
-#define X86_FEATURE_HWP_PKG_REQ ( 7*32+14) /* Intel HWP_PKG_REQ */
+
#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */

/* Virtualization flags: Linux defined, word 8 */
@@ -205,16 +200,7 @@
#define X86_FEATURE_FLEXPRIORITY ( 8*32+ 2) /* Intel FlexPriority */
#define X86_FEATURE_EPT ( 8*32+ 3) /* Intel Extended Page Table */
#define X86_FEATURE_VPID ( 8*32+ 4) /* Intel Virtual Processor ID */
-#define X86_FEATURE_NPT ( 8*32+ 5) /* AMD Nested Page Table support */
-#define X86_FEATURE_LBRV ( 8*32+ 6) /* AMD LBR Virtualization support */
-#define X86_FEATURE_SVML ( 8*32+ 7) /* "svm_lock" AMD SVM locking MSR */
-#define X86_FEATURE_NRIPS ( 8*32+ 8) /* "nrip_save" AMD SVM next_rip save */
-#define X86_FEATURE_TSCRATEMSR ( 8*32+ 9) /* "tsc_scale" AMD TSC scaling support */
-#define X86_FEATURE_VMCBCLEAN ( 8*32+10) /* "vmcb_clean" AMD VMCB clean bits support */
-#define X86_FEATURE_FLUSHBYASID ( 8*32+11) /* AMD flush-by-ASID support */
-#define X86_FEATURE_DECODEASSISTS ( 8*32+12) /* AMD Decode Assists support */
-#define X86_FEATURE_PAUSEFILTER ( 8*32+13) /* AMD filtered pause intercept */
-#define X86_FEATURE_PFTHRESHOLD ( 8*32+14) /* AMD pause filter threshold */
+
#define X86_FEATURE_VMMCALL ( 8*32+15) /* Prefer vmmcall to vmcall */


@@ -258,6 +244,30 @@
/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
#define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */

+/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
+#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
+#define X86_FEATURE_IDA (14*32+ 1) /* Intel Dynamic Acceleration */
+#define X86_FEATURE_ARAT (14*32+ 2) /* Always Running APIC Timer */
+#define X86_FEATURE_PLN (14*32+ 4) /* Intel Power Limit Notification */
+#define X86_FEATURE_PTS (14*32+ 6) /* Intel Package Thermal Status */
+#define X86_FEATURE_HWP (14*32+ 7) /* Intel Hardware P-states */
+#define X86_FEATURE_HWP_NOTIFY (14*32+ 8) /* HWP Notification */
+#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */
+#define X86_FEATURE_HWP_EPP (14*32+10) /* HWP Energy Perf. Preference */
+#define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */
+
+/* AMD SVM Feature Identification, CPUID level 0x8000000a (edx), word 15 */
+#define X86_FEATURE_NPT (15*32+ 0) /* Nested Page Table support */
+#define X86_FEATURE_LBRV (15*32+ 1) /* LBR Virtualization support */
+#define X86_FEATURE_SVML (15*32+ 2) /* "svm_lock" SVM locking MSR */
+#define X86_FEATURE_NRIPS (15*32+ 3) /* "nrip_save" SVM next_rip save */
+#define X86_FEATURE_TSCRATEMSR (15*32+ 4) /* "tsc_scale" TSC scaling support */
+#define X86_FEATURE_VMCBCLEAN (15*32+ 5) /* "vmcb_clean" VMCB clean bits support */
+#define X86_FEATURE_FLUSHBYASID (15*32+ 6) /* flush-by-ASID support */
+#define X86_FEATURE_DECODEASSISTS (15*32+ 7) /* Decode Assists support */
+#define X86_FEATURE_PAUSEFILTER (15*32+10) /* filtered pause intercept */
+#define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */
+
/*
* BUG word(s)
*/
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4ddd780aeac9..3e7ed149082b 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -619,6 +619,8 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx);

c->x86_capability[9] = ebx;
+
+ c->x86_capability[14] = cpuid_eax(0x00000006);
}

/* Extended state features: level 0x0000000d */
@@ -680,6 +682,9 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
if (c->extended_cpuid_level >= 0x80000007)
c->x86_power = cpuid_edx(0x80000007);

+ if (c->extended_cpuid_level >= 0x8000000a)
+ c->x86_capability[15] = cpuid_edx(0x8000000a);
+
init_scattered_cpuid_features(c);
}

diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 608fb26c7254..8cb57df9398d 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -31,32 +31,12 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
const struct cpuid_bit *cb;

static const struct cpuid_bit cpuid_bits[] = {
- { X86_FEATURE_DTHERM, CR_EAX, 0, 0x00000006, 0 },
- { X86_FEATURE_IDA, CR_EAX, 1, 0x00000006, 0 },
- { X86_FEATURE_ARAT, CR_EAX, 2, 0x00000006, 0 },
- { X86_FEATURE_PLN, CR_EAX, 4, 0x00000006, 0 },
- { X86_FEATURE_PTS, CR_EAX, 6, 0x00000006, 0 },
- { X86_FEATURE_HWP, CR_EAX, 7, 0x00000006, 0 },
- { X86_FEATURE_HWP_NOTIFY, CR_EAX, 8, 0x00000006, 0 },
- { X86_FEATURE_HWP_ACT_WINDOW, CR_EAX, 9, 0x00000006, 0 },
- { X86_FEATURE_HWP_EPP, CR_EAX,10, 0x00000006, 0 },
- { X86_FEATURE_HWP_PKG_REQ, CR_EAX,11, 0x00000006, 0 },
{ X86_FEATURE_INTEL_PT, CR_EBX,25, 0x00000007, 0 },
{ X86_FEATURE_APERFMPERF, CR_ECX, 0, 0x00000006, 0 },
{ X86_FEATURE_EPB, CR_ECX, 3, 0x00000006, 0 },
{ X86_FEATURE_HW_PSTATE, CR_EDX, 7, 0x80000007, 0 },
{ X86_FEATURE_CPB, CR_EDX, 9, 0x80000007, 0 },
{ X86_FEATURE_PROC_FEEDBACK, CR_EDX,11, 0x80000007, 0 },
- { X86_FEATURE_NPT, CR_EDX, 0, 0x8000000a, 0 },
- { X86_FEATURE_LBRV, CR_EDX, 1, 0x8000000a, 0 },
- { X86_FEATURE_SVML, CR_EDX, 2, 0x8000000a, 0 },
- { X86_FEATURE_NRIPS, CR_EDX, 3, 0x8000000a, 0 },
- { X86_FEATURE_TSCRATEMSR, CR_EDX, 4, 0x8000000a, 0 },
- { X86_FEATURE_VMCBCLEAN, CR_EDX, 5, 0x8000000a, 0 },
- { X86_FEATURE_FLUSHBYASID, CR_EDX, 6, 0x8000000a, 0 },
- { X86_FEATURE_DECODEASSISTS, CR_EDX, 7, 0x8000000a, 0 },
- { X86_FEATURE_PAUSEFILTER, CR_EDX,10, 0x8000000a, 0 },
- { X86_FEATURE_PFTHRESHOLD, CR_EDX,12, 0x8000000a, 0 },
{ 0, 0, 0, 0, 0 }
};

--
2.3.5
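
A side note on the encoding this patch relies on: an X86_FEATURE_* constant is
simply word*32 + bit, so storing a CPUID output register verbatim as
->x86_capability word 14 or 15 is all it takes for the generic bit test to find
the flags, with no per-bit copying as in scattered.c. A minimal standalone C
sketch of that indexing (the register value below is made up):

#include <stdio.h>
#include <stdint.h>

#define NCAPINTS		16
#define X86_FEATURE_NPT		(15*32 + 0)	/* word 15 = CPUID 0x8000000a EDX */

/* the word/bit arithmetic the kernel's capability bit test boils down to */
static int has_feature(const uint32_t caps[NCAPINTS], unsigned int bit)
{
	return (caps[bit >> 5] >> (bit & 31)) & 1;
}

int main(void)
{
	uint32_t caps[NCAPINTS] = { 0 };

	caps[15] = 0x1;		/* pretend CPUID 0x8000000a EDX had NPT set */
	printf("NPT: %d\n", has_feature(caps, X86_FEATURE_NPT));
	return 0;
}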

2015-11-10 11:49:47

by Borislav Petkov

Subject: [RFC PATCH 2/3] x86/cpufeature: Cleanup get_cpu_cap()

From: Borislav Petkov <[email protected]>

Add an enum for the ->x86_capability array indices and clean up
get_cpu_cap() by killing some redundant local vars.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/cpufeature.h | 20 +++++++++++++++++
arch/x86/kernel/cpu/centaur.c | 2 +-
arch/x86/kernel/cpu/common.c | 47 ++++++++++++++++++---------------------
arch/x86/kernel/cpu/transmeta.c | 4 ++--
4 files changed, 45 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 13d78e0e6ae0..35401fef0d75 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -288,6 +288,26 @@
#include <asm/asm.h>
#include <linux/bitops.h>

+enum cpuid_leafs
+{
+ CPUID_1_EDX = 0,
+ CPUID_8000_0001_EDX,
+ CPUID_8086_0001_EDX,
+ CPUID_LNX_1,
+ CPUID_1_ECX,
+ CPUID_C000_0001_EDX,
+ CPUID_8000_0001_ECX,
+ CPUID_LNX_2,
+ CPUID_LNX_3,
+ CPUID_7_0_EBX,
+ CPUID_D_1_EAX,
+ CPUID_F_0_EDX,
+ CPUID_F_1_EDX,
+ CPUID_8000_0008_EBX,
+ CPUID_6_EAX,
+ CPUID_8000_000A_EDX,
+};
+
#ifdef CONFIG_X86_FEATURE_NAMES
extern const char * const x86_cap_flags[NCAPINTS*32];
extern const char * const x86_power_flags[32];
diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c
index d8fba5c15fbd..ae20be6e483c 100644
--- a/arch/x86/kernel/cpu/centaur.c
+++ b/arch/x86/kernel/cpu/centaur.c
@@ -43,7 +43,7 @@ static void init_c3(struct cpuinfo_x86 *c)
/* store Centaur Extended Feature Flags as
* word 5 of the CPU capability bit array
*/
- c->x86_capability[5] = cpuid_edx(0xC0000001);
+ c->x86_capability[CPUID_C000_0001_EDX] = cpuid_edx(0xC0000001);
}
#ifdef CONFIG_X86_32
/* Cyrix III family needs CX8 & PGE explicitly enabled. */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3e7ed149082b..6b6a74ddd4fc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -600,52 +600,47 @@ void cpu_detect(struct cpuinfo_x86 *c)

void get_cpu_cap(struct cpuinfo_x86 *c)
{
- u32 tfms, xlvl;
- u32 ebx;
+ u32 eax, ebx, ecx, edx;

/* Intel-defined flags: level 0x00000001 */
if (c->cpuid_level >= 0x00000001) {
- u32 capability, excap;
+ cpuid(0x00000001, &eax, &ebx, &ecx, &edx);

- cpuid(0x00000001, &tfms, &ebx, &excap, &capability);
- c->x86_capability[0] = capability;
- c->x86_capability[4] = excap;
+ c->x86_capability[CPUID_1_ECX] = ecx;
+ c->x86_capability[CPUID_1_EDX] = edx;
}

/* Additional Intel-defined flags: level 0x00000007 */
if (c->cpuid_level >= 0x00000007) {
- u32 eax, ebx, ecx, edx;
-
cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx);

- c->x86_capability[9] = ebx;
+ c->x86_capability[CPUID_7_0_EBX] = ebx;

- c->x86_capability[14] = cpuid_eax(0x00000006);
+ c->x86_capability[CPUID_6_EAX] = cpuid_eax(0x00000006);
}

/* Extended state features: level 0x0000000d */
if (c->cpuid_level >= 0x0000000d) {
- u32 eax, ebx, ecx, edx;
-
cpuid_count(0x0000000d, 1, &eax, &ebx, &ecx, &edx);

- c->x86_capability[10] = eax;
+ c->x86_capability[CPUID_D_1_EAX] = eax;
}

/* Additional Intel-defined flags: level 0x0000000F */
if (c->cpuid_level >= 0x0000000F) {
- u32 eax, ebx, ecx, edx;

/* QoS sub-leaf, EAX=0Fh, ECX=0 */
cpuid_count(0x0000000F, 0, &eax, &ebx, &ecx, &edx);
- c->x86_capability[11] = edx;
+ c->x86_capability[CPUID_F_0_EDX] = edx;
+
if (cpu_has(c, X86_FEATURE_CQM_LLC)) {
/* will be overridden if occupancy monitoring exists */
c->x86_cache_max_rmid = ebx;

/* QoS sub-leaf, EAX=0Fh, ECX=1 */
cpuid_count(0x0000000F, 1, &eax, &ebx, &ecx, &edx);
- c->x86_capability[12] = edx;
+ c->x86_capability[CPUID_F_1_EDX] = edx;
+
if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC)) {
c->x86_cache_max_rmid = ecx;
c->x86_cache_occ_scale = ebx;
@@ -657,22 +652,24 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
}

/* AMD-defined flags: level 0x80000001 */
- xlvl = cpuid_eax(0x80000000);
- c->extended_cpuid_level = xlvl;
+ eax = cpuid_eax(0x80000000);
+ c->extended_cpuid_level = eax;
+
+ if ((eax & 0xffff0000) == 0x80000000) {
+ if (eax >= 0x80000001) {
+ cpuid(0x80000001, &eax, &ebx, &ecx, &edx);

- if ((xlvl & 0xffff0000) == 0x80000000) {
- if (xlvl >= 0x80000001) {
- c->x86_capability[1] = cpuid_edx(0x80000001);
- c->x86_capability[6] = cpuid_ecx(0x80000001);
+ c->x86_capability[CPUID_8000_0001_ECX] = ecx;
+ c->x86_capability[CPUID_8000_0001_EDX] = edx;
}
}

if (c->extended_cpuid_level >= 0x80000008) {
- u32 eax = cpuid_eax(0x80000008);
+ cpuid(0x80000008, &eax, &ebx, &ecx, &edx);

c->x86_virt_bits = (eax >> 8) & 0xff;
c->x86_phys_bits = eax & 0xff;
- c->x86_capability[13] = cpuid_ebx(0x80000008);
+ c->x86_capability[CPUID_8000_0008_EBX] = ebx;
}
#ifdef CONFIG_X86_32
else if (cpu_has(c, X86_FEATURE_PAE) || cpu_has(c, X86_FEATURE_PSE36))
@@ -683,7 +680,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_power = cpuid_edx(0x80000007);

if (c->extended_cpuid_level >= 0x8000000a)
- c->x86_capability[15] = cpuid_edx(0x8000000a);
+ c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);

init_scattered_cpuid_features(c);
}
diff --git a/arch/x86/kernel/cpu/transmeta.c b/arch/x86/kernel/cpu/transmeta.c
index 3fa0e5ad86b4..252da7aceca6 100644
--- a/arch/x86/kernel/cpu/transmeta.c
+++ b/arch/x86/kernel/cpu/transmeta.c
@@ -12,7 +12,7 @@ static void early_init_transmeta(struct cpuinfo_x86 *c)
xlvl = cpuid_eax(0x80860000);
if ((xlvl & 0xffff0000) == 0x80860000) {
if (xlvl >= 0x80860001)
- c->x86_capability[2] = cpuid_edx(0x80860001);
+ c->x86_capability[CPUID_8086_0001_EDX] = cpuid_edx(0x80860001);
}
}

@@ -82,7 +82,7 @@ static void init_transmeta(struct cpuinfo_x86 *c)
/* Unhide possibly hidden capability flags */
rdmsr(0x80860004, cap_mask, uk);
wrmsr(0x80860004, ~0, uk);
- c->x86_capability[0] = cpuid_edx(0x00000001);
+ c->x86_capability[CPUID_1_EDX] = cpuid_edx(0x00000001);
wrmsr(0x80860004, cap_mask, uk);

/* All Transmeta CPUs have a constant TSC */
--
2.3.5
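
To see from userspace what get_cpu_cap() now stores where, here is a small
sketch using GCC's <cpuid.h> (x86 only; the enum values mirror the patch, and
__get_cpuid() returning 0 stands in for the kernel's ->cpuid_level and
->extended_cpuid_level guards):

#include <stdio.h>
#include <cpuid.h>

enum { CPUID_1_EDX = 0, CPUID_6_EAX = 14, CPUID_8000_000A_EDX = 15, NWORDS = 16 };

int main(void)
{
	unsigned int eax, ebx, ecx, edx;
	unsigned int caps[NWORDS] = { 0 };

	if (__get_cpuid(0x00000001, &eax, &ebx, &ecx, &edx))
		caps[CPUID_1_EDX] = edx;
	if (__get_cpuid(0x00000006, &eax, &ebx, &ecx, &edx))
		caps[CPUID_6_EAX] = eax;		/* thermal/power leaf */
	if (__get_cpuid(0x8000000a, &eax, &ebx, &ecx, &edx))
		caps[CPUID_8000_000A_EDX] = edx;	/* AMD SVM leaf */

	printf("CPUID_6_EAX = %#x, CPUID_8000_000A_EDX = %#x\n",
	       caps[CPUID_6_EAX], caps[CPUID_8000_000A_EDX]);
	return 0;
}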

2015-11-10 11:49:01

by Borislav Petkov

Subject: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

From: Borislav Petkov <[email protected]>

Those are stupid and code should use static_cpu_has_safe() anyway. Kill
the least used and unused ones.

Signed-off-by: Borislav Petkov <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Matt Mackall <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: David Sterba <[email protected]>
---
arch/x86/crypto/chacha20_glue.c | 2 +-
arch/x86/crypto/crc32c-intel_glue.c | 3 ++-
arch/x86/include/asm/cmpxchg_32.h | 2 +-
arch/x86/include/asm/cpufeature.h | 32 +++--------------------------
arch/x86/include/asm/smp.h | 2 +-
arch/x86/kernel/cpu/amd.c | 2 +-
arch/x86/kernel/cpu/intel.c | 3 ++-
arch/x86/kernel/cpu/mtrr/generic.c | 2 +-
arch/x86/kernel/cpu/mtrr/main.c | 2 +-
arch/x86/kernel/cpu/perf_event_amd.c | 4 ++--
arch/x86/kernel/cpu/perf_event_amd_uncore.c | 8 ++++----
arch/x86/kernel/fpu/init.c | 4 ++--
arch/x86/kernel/hw_breakpoint.c | 3 ++-
arch/x86/kernel/vm86_32.c | 4 +++-
arch/x86/mm/setup_nx.c | 4 ++--
drivers/char/hw_random/via-rng.c | 5 +++--
drivers/crypto/padlock-aes.c | 2 +-
drivers/crypto/padlock-sha.c | 3 ++-
fs/btrfs/disk-io.c | 2 +-
19 files changed, 35 insertions(+), 54 deletions(-)

diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
index 722bacea040e..8a7f1375ece4 100644
--- a/arch/x86/crypto/chacha20_glue.c
+++ b/arch/x86/crypto/chacha20_glue.c
@@ -125,7 +125,7 @@ static struct crypto_alg alg = {

static int __init chacha20_simd_mod_init(void)
{
- if (!cpu_has_ssse3)
+ if (!static_cpu_has_safe(X86_FEATURE_SSSE3))
return -ENODEV;

#ifdef CONFIG_AS_AVX2
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index 81a595d75cf5..72a991fb643f 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -256,8 +256,9 @@ static int __init crc32c_intel_mod_init(void)
{
if (!x86_match_cpu(crc32c_cpu_id))
return -ENODEV;
+
#ifdef CONFIG_X86_64
- if (cpu_has_pclmulqdq) {
+ if (static_cpu_has_safe(X86_FEATURE_PCLMULQDQ)) {
alg.update = crc32c_pcl_intel_update;
alg.finup = crc32c_pcl_intel_finup;
alg.digest = crc32c_pcl_intel_digest;
diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index f7e142926481..2c88969f78c7 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -109,6 +109,6 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)

#endif

-#define system_has_cmpxchg_double() cpu_has_cx8
+#define system_has_cmpxchg_double() static_cpu_has_safe(X86_FEATURE_CX8)

#endif /* _ASM_X86_CMPXCHG_32_H */
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 35401fef0d75..27ab2e7d14c4 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -390,53 +390,27 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
#define cpu_has_tsc boot_cpu_has(X86_FEATURE_TSC)
#define cpu_has_pge boot_cpu_has(X86_FEATURE_PGE)
#define cpu_has_apic boot_cpu_has(X86_FEATURE_APIC)
-#define cpu_has_sep boot_cpu_has(X86_FEATURE_SEP)
-#define cpu_has_mtrr boot_cpu_has(X86_FEATURE_MTRR)
#define cpu_has_mmx boot_cpu_has(X86_FEATURE_MMX)
#define cpu_has_fxsr boot_cpu_has(X86_FEATURE_FXSR)
#define cpu_has_xmm boot_cpu_has(X86_FEATURE_XMM)
#define cpu_has_xmm2 boot_cpu_has(X86_FEATURE_XMM2)
-#define cpu_has_xmm3 boot_cpu_has(X86_FEATURE_XMM3)
-#define cpu_has_ssse3 boot_cpu_has(X86_FEATURE_SSSE3)
#define cpu_has_aes boot_cpu_has(X86_FEATURE_AES)
#define cpu_has_avx boot_cpu_has(X86_FEATURE_AVX)
#define cpu_has_avx2 boot_cpu_has(X86_FEATURE_AVX2)
-#define cpu_has_ht boot_cpu_has(X86_FEATURE_HT)
-#define cpu_has_nx boot_cpu_has(X86_FEATURE_NX)
-#define cpu_has_xstore boot_cpu_has(X86_FEATURE_XSTORE)
-#define cpu_has_xstore_enabled boot_cpu_has(X86_FEATURE_XSTORE_EN)
-#define cpu_has_xcrypt boot_cpu_has(X86_FEATURE_XCRYPT)
-#define cpu_has_xcrypt_enabled boot_cpu_has(X86_FEATURE_XCRYPT_EN)
-#define cpu_has_ace2 boot_cpu_has(X86_FEATURE_ACE2)
-#define cpu_has_ace2_enabled boot_cpu_has(X86_FEATURE_ACE2_EN)
-#define cpu_has_phe boot_cpu_has(X86_FEATURE_PHE)
-#define cpu_has_phe_enabled boot_cpu_has(X86_FEATURE_PHE_EN)
-#define cpu_has_pmm boot_cpu_has(X86_FEATURE_PMM)
-#define cpu_has_pmm_enabled boot_cpu_has(X86_FEATURE_PMM_EN)
-#define cpu_has_ds boot_cpu_has(X86_FEATURE_DS)
-#define cpu_has_pebs boot_cpu_has(X86_FEATURE_PEBS)
#define cpu_has_clflush boot_cpu_has(X86_FEATURE_CLFLUSH)
-#define cpu_has_bts boot_cpu_has(X86_FEATURE_BTS)
#define cpu_has_gbpages boot_cpu_has(X86_FEATURE_GBPAGES)
#define cpu_has_arch_perfmon boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
#define cpu_has_pat boot_cpu_has(X86_FEATURE_PAT)
-#define cpu_has_xmm4_1 boot_cpu_has(X86_FEATURE_XMM4_1)
-#define cpu_has_xmm4_2 boot_cpu_has(X86_FEATURE_XMM4_2)
#define cpu_has_x2apic boot_cpu_has(X86_FEATURE_X2APIC)
#define cpu_has_xsave boot_cpu_has(X86_FEATURE_XSAVE)
-#define cpu_has_xsaveopt boot_cpu_has(X86_FEATURE_XSAVEOPT)
#define cpu_has_xsaves boot_cpu_has(X86_FEATURE_XSAVES)
#define cpu_has_osxsave boot_cpu_has(X86_FEATURE_OSXSAVE)
#define cpu_has_hypervisor boot_cpu_has(X86_FEATURE_HYPERVISOR)
-#define cpu_has_pclmulqdq boot_cpu_has(X86_FEATURE_PCLMULQDQ)
-#define cpu_has_perfctr_core boot_cpu_has(X86_FEATURE_PERFCTR_CORE)
-#define cpu_has_perfctr_nb boot_cpu_has(X86_FEATURE_PERFCTR_NB)
-#define cpu_has_perfctr_l2 boot_cpu_has(X86_FEATURE_PERFCTR_L2)
-#define cpu_has_cx8 boot_cpu_has(X86_FEATURE_CX8)
#define cpu_has_cx16 boot_cpu_has(X86_FEATURE_CX16)
-#define cpu_has_eager_fpu boot_cpu_has(X86_FEATURE_EAGER_FPU)
#define cpu_has_topoext boot_cpu_has(X86_FEATURE_TOPOEXT)
-#define cpu_has_bpext boot_cpu_has(X86_FEATURE_BPEXT)
+/*
+ * Do not add any more of those clumsy macros - use static_cpu_has_safe()!
+ */

#if __GNUC__ >= 4
extern void warn_pre_alternatives(void);
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 222a6a3ca2b5..a8578776c5cb 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -25,7 +25,7 @@ static inline bool cpu_has_ht_siblings(void)
{
bool has_siblings = false;
#ifdef CONFIG_SMP
- has_siblings = cpu_has_ht && smp_num_siblings > 1;
+ has_siblings = static_cpu_has_safe(X86_FEATURE_HT) && smp_num_siblings > 1;
#endif
return has_siblings;
}
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 4a70fc6d400a..c018ca641112 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -909,7 +909,7 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)

void set_dr_addr_mask(unsigned long mask, int dr)
{
- if (!cpu_has_bpext)
+ if (!static_cpu_has_safe(X86_FEATURE_BPEXT))
return;

switch (dr) {
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 98a13db5f4be..5fc71c43dc22 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -444,7 +444,8 @@ static void init_intel(struct cpuinfo_x86 *c)

if (cpu_has_xmm2)
set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
- if (cpu_has_ds) {
+
+ if (static_cpu_has_safe(X86_FEATURE_DS)) {
unsigned int l1;
rdmsr(MSR_IA32_MISC_ENABLE, l1, l2);
if (!(l1 & (1<<11)))
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 3b533cf37c74..8f2ef910c7bf 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -349,7 +349,7 @@ static void get_fixed_ranges(mtrr_type *frs)

void mtrr_save_fixed_ranges(void *info)
{
- if (cpu_has_mtrr)
+ if (static_cpu_has_safe(X86_FEATURE_MTRR))
get_fixed_ranges(mtrr_state.fixed_ranges);
}

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index f891b4750f04..2c01181236fc 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -682,7 +682,7 @@ void __init mtrr_bp_init(void)

phys_addr = 32;

- if (cpu_has_mtrr) {
+ if (static_cpu_has_safe(X86_FEATURE_MTRR)) {
mtrr_if = &generic_mtrr_ops;
size_or_mask = SIZE_OR_MASK_BITS(36);
size_and_mask = 0x00f00000;
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index 1cee5d2d7ece..86fac8bf627c 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -160,7 +160,7 @@ static inline int amd_pmu_addr_offset(int index, bool eventsel)
if (offset)
return offset;

- if (!cpu_has_perfctr_core)
+ if (!static_cpu_has_safe(X86_FEATURE_PERFCTR_CORE))
offset = index;
else
offset = index << 1;
@@ -652,7 +652,7 @@ static __initconst const struct x86_pmu amd_pmu = {

static int __init amd_core_pmu_init(void)
{
- if (!cpu_has_perfctr_core)
+ if (!static_cpu_has_safe(X86_FEATURE_PERFCTR_CORE))
return 0;

switch (boot_cpu_data.x86) {
diff --git a/arch/x86/kernel/cpu/perf_event_amd_uncore.c b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
index cc6cedb8f25d..533e5eab7d94 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
@@ -526,7 +526,7 @@ static int __init amd_uncore_init(void)
if (!cpu_has_topoext)
goto fail_nodev;

- if (cpu_has_perfctr_nb) {
+ if (static_cpu_has_safe(X86_FEATURE_PERFCTR_NB)) {
amd_uncore_nb = alloc_percpu(struct amd_uncore *);
if (!amd_uncore_nb) {
ret = -ENOMEM;
@@ -540,7 +540,7 @@ static int __init amd_uncore_init(void)
ret = 0;
}

- if (cpu_has_perfctr_l2) {
+ if (static_cpu_has_safe(X86_FEATURE_PERFCTR_L2)) {
amd_uncore_l2 = alloc_percpu(struct amd_uncore *);
if (!amd_uncore_l2) {
ret = -ENOMEM;
@@ -583,10 +583,10 @@ fail_online:

/* amd_uncore_nb/l2 should have been freed by cleanup_cpu_online */
amd_uncore_nb = amd_uncore_l2 = NULL;
- if (cpu_has_perfctr_l2)
+ if (static_cpu_has_safe(X86_FEATURE_PERFCTR_L2))
perf_pmu_unregister(&amd_l2_pmu);
fail_l2:
- if (cpu_has_perfctr_nb)
+ if (static_cpu_has_safe(X86_FEATURE_PERFCTR_NB))
perf_pmu_unregister(&amd_nb_pmu);
if (amd_uncore_l2)
free_percpu(amd_uncore_l2);
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index be39b5fde4b9..c5674bbecc85 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -12,7 +12,7 @@
*/
static void fpu__init_cpu_ctx_switch(void)
{
- if (!cpu_has_eager_fpu)
+ if (!static_cpu_has_safe(X86_FEATURE_EAGER_FPU))
stts();
else
clts();
@@ -287,7 +287,7 @@ static void __init fpu__init_system_ctx_switch(void)
current_thread_info()->status = 0;

/* Auto enable eagerfpu for xsaveopt */
- if (cpu_has_xsaveopt && eagerfpu != DISABLE)
+ if (static_cpu_has_safe(X86_FEATURE_XSAVEOPT) && eagerfpu != DISABLE)
eagerfpu = ENABLE;

if (xfeatures_mask & XFEATURE_MASK_EAGER) {
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 50a3fad5b89f..27c69ec5fc30 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -307,8 +307,9 @@ static int arch_build_bp_info(struct perf_event *bp)
* breakpoints, then we'll have to check for kprobe-blacklisted
* addresses anywhere in the range.
*/
- if (!cpu_has_bpext)
+ if (!static_cpu_has_safe(X86_FEATURE_BPEXT))
return -EOPNOTSUPP;
+
info->mask = bp->attr.bp_len - 1;
info->len = X86_BREAKPOINT_LEN_1;
}
diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 524619351961..483231ebbb0b 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -357,8 +357,10 @@ static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus)
tss = &per_cpu(cpu_tss, get_cpu());
/* make room for real-mode segments */
tsk->thread.sp0 += 16;
- if (cpu_has_sep)
+
+ if (static_cpu_has_safe(X86_FEATURE_SEP))
tsk->thread.sysenter_cs = 0;
+
load_sp0(tss, &tsk->thread);
put_cpu();

diff --git a/arch/x86/mm/setup_nx.c b/arch/x86/mm/setup_nx.c
index 90555bf60aa4..595dc8e019a1 100644
--- a/arch/x86/mm/setup_nx.c
+++ b/arch/x86/mm/setup_nx.c
@@ -31,7 +31,7 @@ early_param("noexec", noexec_setup);

void x86_configure_nx(void)
{
- if (cpu_has_nx && !disable_nx)
+ if (static_cpu_has_safe(X86_FEATURE_NX) && !disable_nx)
__supported_pte_mask |= _PAGE_NX;
else
__supported_pte_mask &= ~_PAGE_NX;
@@ -39,7 +39,7 @@ void x86_configure_nx(void)

void __init x86_report_nx(void)
{
- if (!cpu_has_nx) {
+ if (!static_cpu_has_safe(X86_FEATURE_NX)) {
printk(KERN_NOTICE "Notice: NX (Execute Disable) protection "
"missing in CPU!\n");
} else {
diff --git a/drivers/char/hw_random/via-rng.c b/drivers/char/hw_random/via-rng.c
index 0c98a9d51a24..6052f619fd39 100644
--- a/drivers/char/hw_random/via-rng.c
+++ b/drivers/char/hw_random/via-rng.c
@@ -140,7 +140,7 @@ static int via_rng_init(struct hwrng *rng)
* RNG configuration like it used to be the case in this
* register */
if ((c->x86 == 6) && (c->x86_model >= 0x0f)) {
- if (!cpu_has_xstore_enabled) {
+ if (!static_cpu_has_safe(X86_FEATURE_XSTORE_EN)) {
pr_err(PFX "can't enable hardware RNG "
"if XSTORE is not enabled\n");
return -ENODEV;
@@ -200,8 +200,9 @@ static int __init mod_init(void)
{
int err;

- if (!cpu_has_xstore)
+ if (!static_cpu_has_safe(X86_FEATURE_XSTORE))
return -ENODEV;
+
pr_info("VIA RNG detected\n");
err = hwrng_register(&via_rng);
if (err) {
diff --git a/drivers/crypto/padlock-aes.c b/drivers/crypto/padlock-aes.c
index da2d6777bd09..360e941968dd 100644
--- a/drivers/crypto/padlock-aes.c
+++ b/drivers/crypto/padlock-aes.c
@@ -515,7 +515,7 @@ static int __init padlock_init(void)
if (!x86_match_cpu(padlock_cpu_id))
return -ENODEV;

- if (!cpu_has_xcrypt_enabled) {
+ if (!static_cpu_has_safe(X86_FEATURE_XCRYPT_EN)) {
printk(KERN_NOTICE PFX "VIA PadLock detected, but not enabled. Hmm, strange...\n");
return -ENODEV;
}
diff --git a/drivers/crypto/padlock-sha.c b/drivers/crypto/padlock-sha.c
index 4e154c9b9206..18fec3381178 100644
--- a/drivers/crypto/padlock-sha.c
+++ b/drivers/crypto/padlock-sha.c
@@ -540,7 +540,8 @@ static int __init padlock_init(void)
struct shash_alg *sha1;
struct shash_alg *sha256;

- if (!x86_match_cpu(padlock_sha_ids) || !cpu_has_phe_enabled)
+ if (!x86_match_cpu(padlock_sha_ids) ||
+ !static_cpu_has_safe(X86_FEATURE_PHE_EN))
return -ENODEV;

/* Register the newly added algorithm module if on *
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 640598c0d0e7..d4544172e399 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -923,7 +923,7 @@ static int check_async_write(struct inode *inode, unsigned long bio_flags)
if (bio_flags & EXTENT_BIO_TREE_LOG)
return 0;
#ifdef CONFIG_X86
- if (cpu_has_xmm4_2)
+ if (static_cpu_has_safe(X86_FEATURE_XMM4_2))
return 0;
#endif
return 1;
--
2.3.5
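
For context on what the converted call sites expand to: each deleted cpu_has_xx
wrapper was just boot_cpu_has(X86_FEATURE_XX), i.e. a plain bit test against
boot_cpu_data. Modeled in standalone C (simplified; the real cpu_has() also
constant-folds bits listed in REQUIRED_MASK/DISABLED_MASK):

#include <stdint.h>

#define X86_FEATURE_SSSE3	(4*32 + 9)	/* word 4 = CPUID 1 ECX */

struct cpuinfo_model {
	uint32_t x86_capability[16];
};

static struct cpuinfo_model boot_cpu_data_model;	/* filled at boot */

/* what cpu_has_ssse3 amounted to before this patch removed it */
static int boot_cpu_has_model(unsigned int bit)
{
	return (boot_cpu_data_model.x86_capability[bit >> 5] >> (bit & 31)) & 1;
}

static_cpu_has_safe(), in contrast, is patched in place by the alternatives
machinery at boot, which is what the size discussion further down the thread
turns on.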

2015-11-10 11:59:08

by David Sterba

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Tue, Nov 10, 2015 at 12:48:42PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
>
> Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> the least used and unused ones.
>
> Signed-off-by: Borislav Petkov <[email protected]>
> Cc: Herbert Xu <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Matt Mackall <[email protected]>
> Cc: Chris Mason <[email protected]>
> Cc: Josef Bacik <[email protected]>
> Cc: David Sterba <[email protected]>

Acked-by: David Sterba <[email protected]>

2015-11-10 12:30:08

by Ingo Molnar

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros


* Borislav Petkov <[email protected]> wrote:

> From: Borislav Petkov <[email protected]>
>
> Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> the least used and unused ones.

So cpufeature.h doesn't really do a good job of explaining what the difference is
between all these variants:

cpu_has()
static_cpu_has()
static_cpu_has_safe()

it has this comment:

/*
* Static testing of CPU features. Used the same as boot_cpu_has().
* These are only valid after alternatives have run, but will statically
* patch the target code for additional performance.
*/

The second sentence does not parse. Why does the third sentence have a 'but' for
listing properties? It's either bad grammar or tries to tell something that isn't
being told properly.

It's entirely silent on the difference between static_cpu_has() and
static_cpu_has_safe() - what makes the second one 'safe'?

Thanks,

Ingo

2015-11-10 12:47:46

by Borislav Petkov

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Tue, Nov 10, 2015 at 01:30:00PM +0100, Ingo Molnar wrote:
...

> It's entirely silent on the difference between static_cpu_has() and
> static_cpu_has_safe() - what makes the second one 'safe'?

Yeah, those really are lacking some more fleshy and detailed
explanations. :-)

I'll document those properly.

Thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2015-11-18 18:23:19

by Borislav Petkov

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Tue, Nov 10, 2015 at 01:30:00PM +0100, Ingo Molnar wrote:
>
> * Borislav Petkov <[email protected]> wrote:
>
> > From: Borislav Petkov <[email protected]>
> >
> > Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> > the least used and unused ones.
>
> So cpufeature.h doesn't really do a good job of explaining what the difference is
> between all these variants:

How's that for starters?

---
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 27ab2e7d14c4..a9a8313e278e 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -351,6 +351,10 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
(((bit)>>5)==8 && (1UL<<((bit)&31) & DISABLED_MASK8)) || \
(((bit)>>5)==9 && (1UL<<((bit)&31) & DISABLED_MASK9)) )

+/*
+ * Test whether the CPU represented by descriptor @c has the feature bit @bit
+ * set.
+ */
#define cpu_has(c, bit) \
(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \
test_cpu_cap(c, bit))
@@ -416,11 +420,6 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
extern void warn_pre_alternatives(void);
extern bool __static_cpu_has_safe(u16 bit);

-/*
- * Static testing of CPU features. Used the same as boot_cpu_has().
- * These are only valid after alternatives have run, but will statically
- * patch the target code for additional performance.
- */
static __always_inline __pure bool __static_cpu_has(u16 bit)
{
#ifdef CC_HAVE_ASM_GOTO
@@ -495,6 +494,18 @@ static __always_inline __pure bool __static_cpu_has(u16 bit)
#endif /* CC_HAVE_ASM_GOTO */
}

+/*
+ * Test whether the boot CPU has feature bit @bit enabled.
+ *
+ * This is static testing of CPU features. It is used in the same manner as
+ * boot_cpu_has(). It differs from the previous one in that the alternatives
+ * infrastructure will statically patch the code where the test is performed for
+ * additional performance.
+ *
+ * However, results from that macro are only valid after the alternatives have
+ * run and not before that. IOW, you want static_cpu_has_safe() instead, see
+ * below.
+ */
#define static_cpu_has(bit) \
( \
__builtin_constant_p(boot_cpu_has(bit)) ? \
@@ -580,6 +591,11 @@ static __always_inline __pure bool _static_cpu_has_safe(u16 bit)
#endif /* CC_HAVE_ASM_GOTO */
}

+/*
+ * Like static_cpu_has() above but it works even before the alternatives have
+ * run by falling back to boot_cpu_has(). You should use that macro for all your
+ * CPU feature bit testing needs.
+ */
#define static_cpu_has_safe(bit) \
( \
__builtin_constant_p(boot_cpu_has(bit)) ? \

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
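
The rule of thumb these comments establish, static_cpu_has_safe() for fast
paths and boot_cpu_has() everywhere else (spelled out explicitly later in the
thread), looks like this at hypothetical call sites (helper names made up for
illustration):

	/* init path, runs once at boot: the plain memory test is fine
	 * and adds nothing to .altinstructions */
	if (boot_cpu_has(X86_FEATURE_MTRR))
		setup_mtrr_state();		/* made-up helper */

	/* hot path, executed long after alternatives have run: the
	 * statically patched test avoids the load+test+branch */
	if (static_cpu_has_safe(X86_FEATURE_XSAVE))
		do_fast_xsave();		/* made-up helper */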

2015-11-24 13:05:17

by Borislav Petkov

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Tue, Nov 10, 2015 at 12:48:42PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
>
> Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> the least used and unused ones.
>
> Signed-off-by: Borislav Petkov <[email protected]>
> Cc: Herbert Xu <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Matt Mackall <[email protected]>
> Cc: Chris Mason <[email protected]>
> Cc: Josef Bacik <[email protected]>
> Cc: David Sterba <[email protected]>
> ---
> arch/x86/crypto/chacha20_glue.c | 2 +-
> arch/x86/crypto/crc32c-intel_glue.c | 3 ++-
> arch/x86/include/asm/cmpxchg_32.h | 2 +-
> arch/x86/include/asm/cpufeature.h | 32 +++--------------------------
> arch/x86/include/asm/smp.h | 2 +-
> arch/x86/kernel/cpu/amd.c | 2 +-
> arch/x86/kernel/cpu/intel.c | 3 ++-
> arch/x86/kernel/cpu/mtrr/generic.c | 2 +-
> arch/x86/kernel/cpu/mtrr/main.c | 2 +-
> arch/x86/kernel/cpu/perf_event_amd.c | 4 ++--
> arch/x86/kernel/cpu/perf_event_amd_uncore.c | 8 ++++----
> arch/x86/kernel/fpu/init.c | 4 ++--
> arch/x86/kernel/hw_breakpoint.c | 3 ++-
> arch/x86/kernel/vm86_32.c | 4 +++-
> arch/x86/mm/setup_nx.c | 4 ++--
> drivers/char/hw_random/via-rng.c | 5 +++--
> drivers/crypto/padlock-aes.c | 2 +-
> drivers/crypto/padlock-sha.c | 3 ++-
> fs/btrfs/disk-io.c | 2 +-
> 19 files changed, 35 insertions(+), 54 deletions(-)

Ok, 0day says this patch makes tiny not so tiny:

i386-tinyconfig vmlinux size:

+-------+------+-------+-----+--------------------------------------------------------------------------------------+
| TOTAL | TEXT | DATA | BSS | |
+-------+------+-------+-----+--------------------------------------------------------------------------------------+
| +4646 | +64 | +4096 | 0 | ab9976b5af96 x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros |
| -32 | -32 | 0 | 0 | 13e835020a02 x86/cpufeature: Cleanup get_cpu_cap() |
| +32 | +32 | 0 | 0 | 3615f94f0486 x86/cpufeature: Move some of the scattered feature bits to x86_capabili |
| +136 | +32 | 0 | 0 | 506d983184f4 Merge branch 'tip-fpu-xsave' into rc2+ |
| +4782 | +96 | +4096 | 0 | ALL COMMITS |
+-------+------+-------+-----+--------------------------------------------------------------------------------------+

Btw, thanks 0day!

The problem comes from static_cpu_has_safe() adding the alternatives and
fallback machinery. For example, before the patch, we had this at the
cpu_has_* testing sites:

movl boot_cpu_data+20, %eax # MEM[(const long unsigned int *)&boot_cpu_data + 20B], D.19113
testl $2097152, %eax #, D.19113
je .L166 #,

and now we get this:

#APP
# 449 "arch/x86/kernel/cpu/intel.c" 1
# 0 "" 2
# 511 "./arch/x86/include/asm/cpufeature.h" 1
1: jmp .L166 #
2:
.skip -(((5f-4f) - (2b-1b)) > 0) * ((5f-4f) - (2b-1b)),0x90
3:
.section .altinstructions,"a"
.long 1b - .
.long 4f - .
.word 117 #
.byte 3b - 1b
.byte 5f - 4f
.byte 3b - 2b
.previous
.section .altinstr_replacement,"ax"
4: jmp .L167 #
5:
.previous
.section .altinstructions,"a"
.long 1b - .
.long 0
.word 21 #
.byte 3b - 1b
.byte 0
.byte 0
.previous

# 0 "" 2
#NO_APP
jmp .L168 #
.L166:
movl $21, %eax #,
call __static_cpu_has_safe #
testb %al, %al # D.19126
je .L167 #,
.L168:
#APP
# 453 "arch/x86/kernel/cpu/intel.c" 1
# 0 "" 2
#NO_APP

That gets spread among .altinstructions, .altinstr_replacement, .text
etc sections. .data grows too probably because of the NOP padding :-\

text data bss dec hex filename
before: 644896 127436 1189384 1961716 1deef4 vmlinux
after: 645446 131532 1189384 1966362 1e011a vmlinux

[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
before: [12] .altinstructions PROGBITS c10bdf48 0bef48 000680 00 A 0 0 1
after: [12] .altinstructions PROGBITS c10bff48 0c0f48 0007d2 00 A 0 0 1

before: [13] .altinstr_replace PROGBITS c10be5c8 0bf5c8 00016c 00 AX 0 0 1
after: [13] .altinstr_replace PROGBITS c10c071a 0c171a 0001ad 00 AX 0 0 1

before: [ 7] .data PROGBITS c1092000 093000 0132a0 00 WA 0 0 4096
after: [ 7] .data PROGBITS c1093000 094000 0142a0 00 WA 0 0 4096

So I'm wondering if we should make a config option which converts
static_cpu_has* macros to boot_cpu_has()? That should slim down
the kernel even more but it won't benefit from the speedup of the
static_cpu_has* stuff.

Josh, thoughts?

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
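
For reference, each record in the .altinstructions dump above corresponds to
one entry of roughly this shape (a simplified model; the real struct alt_instr
lives in arch/x86/include/asm/alternative.h):

#include <stdint.h>

struct alt_instr_model {
	int32_t  instr_offset;		/* original insn, relative (.long 1b - .) */
	int32_t  repl_offset;		/* replacement insn (.long 4f - .)        */
	uint16_t cpuid;			/* gating X86_FEATURE_* bit (.word ...)   */
	uint8_t  instrlen;		/* length of original sequence            */
	uint8_t  replacementlen;	/* length of replacement                  */
	uint8_t  padlen;		/* NOP padding added                      */
};

Each entry is 13 bytes, and a static_cpu_has_safe() site emits two of them: one
keyed on X86_FEATURE_ALWAYS (3*32+21 = 117, the ".word 117" above) and one on
the bit actually tested (".word 21", i.e. X86_FEATURE_DS in this intel.c
example). That, plus the replacement jumps, is where the section growth comes
from.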

2015-11-24 22:42:33

by Josh Triplett

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Tue, Nov 24, 2015 at 02:05:10PM +0100, Borislav Petkov wrote:
> On Tue, Nov 10, 2015 at 12:48:42PM +0100, Borislav Petkov wrote:
> > From: Borislav Petkov <[email protected]>
> >
> > Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> > the least used and unused ones.
> >
> > Signed-off-by: Borislav Petkov <[email protected]>
> > Cc: Herbert Xu <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Matt Mackall <[email protected]>
> > Cc: Chris Mason <[email protected]>
> > Cc: Josef Bacik <[email protected]>
> > Cc: David Sterba <[email protected]>
> > ---
> > arch/x86/crypto/chacha20_glue.c | 2 +-
> > arch/x86/crypto/crc32c-intel_glue.c | 3 ++-
> > arch/x86/include/asm/cmpxchg_32.h | 2 +-
> > arch/x86/include/asm/cpufeature.h | 32 +++--------------------------
> > arch/x86/include/asm/smp.h | 2 +-
> > arch/x86/kernel/cpu/amd.c | 2 +-
> > arch/x86/kernel/cpu/intel.c | 3 ++-
> > arch/x86/kernel/cpu/mtrr/generic.c | 2 +-
> > arch/x86/kernel/cpu/mtrr/main.c | 2 +-
> > arch/x86/kernel/cpu/perf_event_amd.c | 4 ++--
> > arch/x86/kernel/cpu/perf_event_amd_uncore.c | 8 ++++----
> > arch/x86/kernel/fpu/init.c | 4 ++--
> > arch/x86/kernel/hw_breakpoint.c | 3 ++-
> > arch/x86/kernel/vm86_32.c | 4 +++-
> > arch/x86/mm/setup_nx.c | 4 ++--
> > drivers/char/hw_random/via-rng.c | 5 +++--
> > drivers/crypto/padlock-aes.c | 2 +-
> > drivers/crypto/padlock-sha.c | 3 ++-
> > fs/btrfs/disk-io.c | 2 +-
> > 19 files changed, 35 insertions(+), 54 deletions(-)
>
> Ok, 0day says this patch makes tiny not so tiny:
>
> i386-tinyconfig vmlinux size:
>
> +-------+------+-------+-----+--------------------------------------------------------------------------------------+
> | TOTAL | TEXT | DATA | BSS | |
> +-------+------+-------+-----+--------------------------------------------------------------------------------------+
> | +4646 | +64 | +4096 | 0 | ab9976b5af96 x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros |
> | -32 | -32 | 0 | 0 | 13e835020a02 x86/cpufeature: Cleanup get_cpu_cap() |
> | +32 | +32 | 0 | 0 | 3615f94f0486 x86/cpufeature: Move some of the scattered feature bits to x86_capabili |
> | +136 | +32 | 0 | 0 | 506d983184f4 Merge branch 'tip-fpu-xsave' into rc2+ |
> | +4782 | +96 | +4096 | 0 | ALL COMMITS |
> +-------+------+-------+-----+--------------------------------------------------------------------------------------+
>
> Btw, thanks 0day!

Yay, it worked! Thanks for paying attention to this.

> The problem comes from static_cpu_has_safe() adding the alternatives and
> fallback machinery. For example, before the patch, we had this at the
> cpu_has_* testing sites:
>
> movl boot_cpu_data+20, %eax # MEM[(const long unsigned int *)&boot_cpu_data + 20B], D.19113
> testl $2097152, %eax #, D.19113
> je .L166 #,
>
> and now we get this:
>
> #APP
> # 449 "arch/x86/kernel/cpu/intel.c" 1
> # 0 "" 2
> # 511 "./arch/x86/include/asm/cpufeature.h" 1
> 1: jmp .L166 #
> 2:
> .skip -(((5f-4f) - (2b-1b)) > 0) * ((5f-4f) - (2b-1b)),0x90
> 3:
> .section .altinstructions,"a"
> .long 1b - .
> .long 4f - .
> .word 117 #
> .byte 3b - 1b
> .byte 5f - 4f
> .byte 3b - 2b
> .previous
> .section .altinstr_replacement,"ax"
> 4: jmp .L167 #
> 5:
> .previous
> .section .altinstructions,"a"
> .long 1b - .
> .long 0
> .word 21 #
> .byte 3b - 1b
> .byte 0
> .byte 0
> .previous
>
> # 0 "" 2
> #NO_APP
> jmp .L168 #
> .L166:
> movl $21, %eax #,
> call __static_cpu_has_safe #
> testb %al, %al # D.19126
> je .L167 #,
> .L168:
> #APP
> # 453 "arch/x86/kernel/cpu/intel.c" 1
> # 0 "" 2
> #NO_APP
>
> That gets spread among .altinstructions, .altinstr_replacement, .text
> etc sections. .data grows too probably because of the NOP padding :-\

Yeah, padding makes the evaluation of some section sizes painful.

That said: .data? I don't quite see how that happened.

> text data bss dec hex filename
> before: 644896 127436 1189384 1961716 1deef4 vmlinux
> after: 645446 131532 1189384 1966362 1e011a vmlinux
>
> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
> before: [12] .altinstructions PROGBITS c10bdf48 0bef48 000680 00 A 0 0 1
> after: [12] .altinstructions PROGBITS c10bff48 0c0f48 0007d2 00 A 0 0 1
>
> before: [13] .altinstr_replace PROGBITS c10be5c8 0bf5c8 00016c 00 AX 0 0 1
> after: [13] .altinstr_replace PROGBITS c10c071a 0c171a 0001ad 00 AX 0 0 1
>
> before: [ 7] .data PROGBITS c1092000 093000 0132a0 00 WA 0 0 4096
> after: [ 7] .data PROGBITS c1093000 094000 0142a0 00 WA 0 0 4096
>
> So I'm wondering if we should make a config option which converts
> static_cpu_has* macros to boot_cpu_has()? That should slim down
> the kernel even more but it won't benefit from the speedup of the
> static_cpu_has* stuff.
>
> Josh, thoughts?

Seems like a good idea to me: that would sacrifice a small amount of
runtime performance in favor of code size. (Note that the config option
should use static_cpu_has when =y, and the slower, smaller method when
=n, so that "allnoconfig" can DTRT.)

Given that many embedded systems will know exactly what CPU they want to
run on, I'd also love to see a way to set the capabilities of the CPU at
compile time, so that all those checks (and the code within them) can
constant-fold away.

- Josh Triplett
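
Josh's build-time idea could look something like the following (an entirely
hypothetical Kconfig knob, sketched only for concreteness; nothing like it
exists in the tree at this point):

#ifdef CONFIG_X86_FORCE_SSSE3			/* hypothetical knob */
# define cpu_has_ssse3_buildtime()	1	/* branch folds away */
#else
# define cpu_has_ssse3_buildtime()	boot_cpu_has(X86_FEATURE_SSSE3)
#endif

With the knob set, a check like "if (!cpu_has_ssse3_buildtime()) return
-ENODEV;" compiles away entirely, which is the constant folding Josh is after.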

2015-11-25 00:11:13

by Andy Lutomirski

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Tue, Nov 24, 2015 at 2:42 PM, Josh Triplett <[email protected]> wrote:
>> text data bss dec hex filename
>> before: 644896 127436 1189384 1961716 1deef4 vmlinux
>> after: 645446 131532 1189384 1966362 1e011a vmlinux
>>
>> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
>> before: [12] .altinstructions PROGBITS c10bdf48 0bef48 000680 00 A 0 0 1
>> after: [12] .altinstructions PROGBITS c10bff48 0c0f48 0007d2 00 A 0 0 1
>>
>> before: [13] .altinstr_replace PROGBITS c10be5c8 0bf5c8 00016c 00 AX 0 0 1
>> after: [13] .altinstr_replace PROGBITS c10c071a 0c171a 0001ad 00 AX 0 0 1
>>
>> before: [ 7] .data PROGBITS c1092000 093000 0132a0 00 WA 0 0 4096
>> after: [ 7] .data PROGBITS c1093000 094000 0142a0 00 WA 0 0 4096
>>
>> So I'm wondering if we should make a config option which converts
>> static_cpu_has* macros to boot_cpu_has()? That should slim down
>> the kernel even more but it won't benefit from the speedup of the
>> static_cpu_has* stuff.
>>
>> Josh, thoughts?
>
> Seems like a good idea to me: that would sacrifice a small amount of
> runtime performance in favor of code size. (Note that the config option
> should use static_cpu_has when =y, and the slower, smaller method when
> =n, so that "allnoconfig" can DTRT.)
>
> Given that many embedded systems will know exactly what CPU they want to
> run on, I'd also love to see a way to set the capabilities of the CPU at
> compile time, so that all those checks (and the code within them) can
> constant-fold away.
>

As another idea, the alternatives infrastructure could plausibly be
rearranged so that it never exists in memory in decompressed form. We
could decompress it streamily and process it as we go.

--Andy

2015-11-25 02:58:31

by Josh Triplett

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On November 24, 2015 4:10:48 PM PST, Andy Lutomirski <[email protected]> wrote:
>On Tue, Nov 24, 2015 at 2:42 PM, Josh Triplett <[email protected]> wrote:
>>> text data bss dec hex filename
>>> before: 644896 127436 1189384 1961716 1deef4 vmlinux
>>> after: 645446 131532 1189384 1966362 1e011a vmlinux
>>>
>>> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
>>> before: [12] .altinstructions PROGBITS c10bdf48 0bef48 000680 00 A 0 0 1
>>> after: [12] .altinstructions PROGBITS c10bff48 0c0f48 0007d2 00 A 0 0 1
>>>
>>> before: [13] .altinstr_replace PROGBITS c10be5c8 0bf5c8 00016c 00 AX 0 0 1
>>> after: [13] .altinstr_replace PROGBITS c10c071a 0c171a 0001ad 00 AX 0 0 1
>>>
>>> before: [ 7] .data PROGBITS c1092000 093000 0132a0 00 WA 0 0 4096
>>> after: [ 7] .data PROGBITS c1093000 094000 0142a0 00 WA 0 0 4096
>>>
>>> So I'm wondering if we should make a config option which converts
>>> static_cpu_has* macros to boot_cpu_has()? That should slim down
>>> the kernel even more but it won't benefit from the speedup of the
>>> static_cpu_has* stuff.
>>>
>>> Josh, thoughts?
>>
>> Seems like a good idea to me: that would sacrifice a small amount of
>> runtime performance in favor of code size. (Note that the config option
>> should use static_cpu_has when =y, and the slower, smaller method when
>> =n, so that "allnoconfig" can DTRT.)
>>
>> Given that many embedded systems will know exactly what CPU they want to
>> run on, I'd also love to see a way to set the capabilities of the CPU at
>> compile time, so that all those checks (and the code within them) can
>> constant-fold away.
>>
>
>As another idea, the alternatives infrastructure could plausibly be
>rearranged so that it never exists in memory in decompressed form. We
>could decompress it streamily and process it as we go.

That doesn't help when running the uncompressed kernel in place, though. It'd
be nice if every use of alternatives and similar mechanisms supported
build-time resolution.

2015-11-27 13:53:12

by Borislav Petkov

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Tue, Nov 24, 2015 at 02:42:11PM -0800, Josh Triplett wrote:
> > So I'm wondering if we should make a config option which converts
> > static_cpu_has* macros to boot_cpu_has()? That should slim down
> > the kernel even more but it won't benefit from the speedup of the
> > static_cpu_has* stuff.
> >
> > Josh, thoughts?
>
> Seems like a good idea to me: that would sacrifice a small amount of
> runtime performance in favor of code size. (Note that the config option
> should use static_cpu_has when =y, and the slower, smaller method when
> =n, so that "allnoconfig" can DTRT.)

Yeah, so first things first.

Concerning the current issue, I went and converted the majority of
macros to use boot_cpu_has() after all. The majority of the paths are not
hot ones but mostly init paths, so static_cpu_has_safe() doesn't make any
sense there.

The result is below, and the whole rework *actually* slims down tinyconfig
when the patches are applied on top of rc2 + tip/master:


commit .TEXT .DATA .BSS
rc2+ 650055 127948 1189128
0a53df8a1a3a ("x86/cpufeature: Move some of the...") 649863 127948 1189384
ed03a85e6575 ("x86/cpufeature: Cleanup get_cpu_cap()") 649831 127948 1189384
acde56aeda14 ("x86/cpufeature: Remove unused and...") 649831 127948 1189384

I'll look at doing the macro thing now, hopefully it doesn't get too ugly.

---
From: Borislav Petkov <[email protected]>
Date: Mon, 9 Nov 2015 10:38:45 +0100
Subject: [PATCH] x86/cpufeature: Remove unused and seldomly used cpu_has_xx
macros

Those are stupid and code should use static_cpu_has_safe() or
boot_cpu_has() instead. Kill the least used and unused ones.

The remaining ones need more careful inspection before a conversion can
happen. On the TODO.

Signed-off-by: Borislav Petkov <[email protected]>
Cc: David Sterba <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Matt Mackall <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/crypto/chacha20_glue.c | 2 +-
arch/x86/crypto/crc32c-intel_glue.c | 2 +-
arch/x86/include/asm/cmpxchg_32.h | 2 +-
arch/x86/include/asm/cmpxchg_64.h | 2 +-
arch/x86/include/asm/cpufeature.h | 37 ++++-------------------------
arch/x86/include/asm/xor_32.h | 2 +-
arch/x86/kernel/cpu/amd.c | 4 ++--
arch/x86/kernel/cpu/common.c | 4 +++-
arch/x86/kernel/cpu/intel.c | 3 ++-
arch/x86/kernel/cpu/intel_cacheinfo.c | 6 ++---
arch/x86/kernel/cpu/mtrr/generic.c | 2 +-
arch/x86/kernel/cpu/mtrr/main.c | 2 +-
arch/x86/kernel/cpu/perf_event_amd.c | 4 ++--
arch/x86/kernel/cpu/perf_event_amd_uncore.c | 11 +++++----
arch/x86/kernel/fpu/init.c | 4 ++--
arch/x86/kernel/hw_breakpoint.c | 6 +++--
arch/x86/kernel/smpboot.c | 2 +-
arch/x86/kernel/vm86_32.c | 4 +++-
arch/x86/mm/setup_nx.c | 4 ++--
drivers/char/hw_random/via-rng.c | 5 ++--
drivers/crypto/padlock-aes.c | 2 +-
drivers/crypto/padlock-sha.c | 2 +-
drivers/iommu/intel_irq_remapping.c | 2 +-
fs/btrfs/disk-io.c | 2 +-
24 files changed, 48 insertions(+), 68 deletions(-)

diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
index 722bacea040e..8baaff5af0b5 100644
--- a/arch/x86/crypto/chacha20_glue.c
+++ b/arch/x86/crypto/chacha20_glue.c
@@ -125,7 +125,7 @@ static struct crypto_alg alg = {

static int __init chacha20_simd_mod_init(void)
{
- if (!cpu_has_ssse3)
+ if (!boot_cpu_has(X86_FEATURE_SSSE3))
return -ENODEV;

#ifdef CONFIG_AS_AVX2
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index 81a595d75cf5..0e9871693f24 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -257,7 +257,7 @@ static int __init crc32c_intel_mod_init(void)
if (!x86_match_cpu(crc32c_cpu_id))
return -ENODEV;
#ifdef CONFIG_X86_64
- if (cpu_has_pclmulqdq) {
+ if (boot_cpu_has(X86_FEATURE_PCLMULQDQ)) {
alg.update = crc32c_pcl_intel_update;
alg.finup = crc32c_pcl_intel_finup;
alg.digest = crc32c_pcl_intel_digest;
diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index f7e142926481..e4959d023af8 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -109,6 +109,6 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)

#endif

-#define system_has_cmpxchg_double() cpu_has_cx8
+#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8)

#endif /* _ASM_X86_CMPXCHG_32_H */
diff --git a/arch/x86/include/asm/cmpxchg_64.h b/arch/x86/include/asm/cmpxchg_64.h
index 1af94697aae5..caa23a34c963 100644
--- a/arch/x86/include/asm/cmpxchg_64.h
+++ b/arch/x86/include/asm/cmpxchg_64.h
@@ -18,6 +18,6 @@ static inline void set_64bit(volatile u64 *ptr, u64 val)
cmpxchg_local((ptr), (o), (n)); \
})

-#define system_has_cmpxchg_double() cpu_has_cx16
+#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16)

#endif /* _ASM_X86_CMPXCHG_64_H */
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 604f63695d7d..cbe390044a7c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -386,58 +386,29 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
} while (0)

#define cpu_has_fpu boot_cpu_has(X86_FEATURE_FPU)
-#define cpu_has_de boot_cpu_has(X86_FEATURE_DE)
#define cpu_has_pse boot_cpu_has(X86_FEATURE_PSE)
#define cpu_has_tsc boot_cpu_has(X86_FEATURE_TSC)
#define cpu_has_pge boot_cpu_has(X86_FEATURE_PGE)
#define cpu_has_apic boot_cpu_has(X86_FEATURE_APIC)
-#define cpu_has_sep boot_cpu_has(X86_FEATURE_SEP)
-#define cpu_has_mtrr boot_cpu_has(X86_FEATURE_MTRR)
-#define cpu_has_mmx boot_cpu_has(X86_FEATURE_MMX)
#define cpu_has_fxsr boot_cpu_has(X86_FEATURE_FXSR)
#define cpu_has_xmm boot_cpu_has(X86_FEATURE_XMM)
#define cpu_has_xmm2 boot_cpu_has(X86_FEATURE_XMM2)
-#define cpu_has_xmm3 boot_cpu_has(X86_FEATURE_XMM3)
-#define cpu_has_ssse3 boot_cpu_has(X86_FEATURE_SSSE3)
#define cpu_has_aes boot_cpu_has(X86_FEATURE_AES)
#define cpu_has_avx boot_cpu_has(X86_FEATURE_AVX)
#define cpu_has_avx2 boot_cpu_has(X86_FEATURE_AVX2)
-#define cpu_has_ht boot_cpu_has(X86_FEATURE_HT)
-#define cpu_has_nx boot_cpu_has(X86_FEATURE_NX)
-#define cpu_has_xstore boot_cpu_has(X86_FEATURE_XSTORE)
-#define cpu_has_xstore_enabled boot_cpu_has(X86_FEATURE_XSTORE_EN)
-#define cpu_has_xcrypt boot_cpu_has(X86_FEATURE_XCRYPT)
-#define cpu_has_xcrypt_enabled boot_cpu_has(X86_FEATURE_XCRYPT_EN)
-#define cpu_has_ace2 boot_cpu_has(X86_FEATURE_ACE2)
-#define cpu_has_ace2_enabled boot_cpu_has(X86_FEATURE_ACE2_EN)
-#define cpu_has_phe boot_cpu_has(X86_FEATURE_PHE)
-#define cpu_has_phe_enabled boot_cpu_has(X86_FEATURE_PHE_EN)
-#define cpu_has_pmm boot_cpu_has(X86_FEATURE_PMM)
-#define cpu_has_pmm_enabled boot_cpu_has(X86_FEATURE_PMM_EN)
-#define cpu_has_ds boot_cpu_has(X86_FEATURE_DS)
-#define cpu_has_pebs boot_cpu_has(X86_FEATURE_PEBS)
#define cpu_has_clflush boot_cpu_has(X86_FEATURE_CLFLUSH)
-#define cpu_has_bts boot_cpu_has(X86_FEATURE_BTS)
#define cpu_has_gbpages boot_cpu_has(X86_FEATURE_GBPAGES)
#define cpu_has_arch_perfmon boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
#define cpu_has_pat boot_cpu_has(X86_FEATURE_PAT)
-#define cpu_has_xmm4_1 boot_cpu_has(X86_FEATURE_XMM4_1)
-#define cpu_has_xmm4_2 boot_cpu_has(X86_FEATURE_XMM4_2)
#define cpu_has_x2apic boot_cpu_has(X86_FEATURE_X2APIC)
#define cpu_has_xsave boot_cpu_has(X86_FEATURE_XSAVE)
-#define cpu_has_xsaveopt boot_cpu_has(X86_FEATURE_XSAVEOPT)
#define cpu_has_xsaves boot_cpu_has(X86_FEATURE_XSAVES)
#define cpu_has_osxsave boot_cpu_has(X86_FEATURE_OSXSAVE)
#define cpu_has_hypervisor boot_cpu_has(X86_FEATURE_HYPERVISOR)
-#define cpu_has_pclmulqdq boot_cpu_has(X86_FEATURE_PCLMULQDQ)
-#define cpu_has_perfctr_core boot_cpu_has(X86_FEATURE_PERFCTR_CORE)
-#define cpu_has_perfctr_nb boot_cpu_has(X86_FEATURE_PERFCTR_NB)
-#define cpu_has_perfctr_l2 boot_cpu_has(X86_FEATURE_PERFCTR_L2)
-#define cpu_has_cx8 boot_cpu_has(X86_FEATURE_CX8)
-#define cpu_has_cx16 boot_cpu_has(X86_FEATURE_CX16)
-#define cpu_has_eager_fpu boot_cpu_has(X86_FEATURE_EAGER_FPU)
-#define cpu_has_topoext boot_cpu_has(X86_FEATURE_TOPOEXT)
-#define cpu_has_bpext boot_cpu_has(X86_FEATURE_BPEXT)
+/*
+ * Do not add any more of those clumsy macros - use static_cpu_has_safe() for
+ * fast paths and boot_cpu_has() otherwise!
+ */

#if __GNUC__ >= 4
extern void warn_pre_alternatives(void);
diff --git a/arch/x86/include/asm/xor_32.h b/arch/x86/include/asm/xor_32.h
index 5a08bc8bff33..ccca77dad474 100644
--- a/arch/x86/include/asm/xor_32.h
+++ b/arch/x86/include/asm/xor_32.h
@@ -553,7 +553,7 @@ do { \
if (cpu_has_xmm) { \
xor_speed(&xor_block_pIII_sse); \
xor_speed(&xor_block_sse_pf64); \
- } else if (cpu_has_mmx) { \
+ } else if (static_cpu_has_safe(X86_FEATURE_MMX)) { \
xor_speed(&xor_block_pII_mmx); \
xor_speed(&xor_block_p5_mmx); \
} else { \
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index e229640c19ab..e678ddeed030 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -304,7 +304,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
int cpu = smp_processor_id();

/* get information required for multi-node processors */
- if (cpu_has_topoext) {
+ if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
u32 eax, ebx, ecx, edx;

cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
@@ -922,7 +922,7 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)

void set_dr_addr_mask(unsigned long mask, int dr)
{
- if (!cpu_has_bpext)
+ if (!boot_cpu_has(X86_FEATURE_BPEXT))
return;

switch (dr) {
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index e72fa2dab911..37830de8f60a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1440,7 +1440,9 @@ void cpu_init(void)

printk(KERN_INFO "Initializing CPU#%d\n", cpu);

- if (cpu_feature_enabled(X86_FEATURE_VME) || cpu_has_tsc || cpu_has_de)
+ if (cpu_feature_enabled(X86_FEATURE_VME) ||
+ cpu_has_tsc ||
+ boot_cpu_has(X86_FEATURE_DE))
cr4_clear_bits(X86_CR4_VME|X86_CR4_PVI|X86_CR4_TSD|X86_CR4_DE);

load_current_idt();
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 209ac1e7d1f0..565648bc1a0a 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -445,7 +445,8 @@ static void init_intel(struct cpuinfo_x86 *c)

if (cpu_has_xmm2)
set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
- if (cpu_has_ds) {
+
+ if (boot_cpu_has(X86_FEATURE_DS)) {
unsigned int l1;
rdmsr(MSR_IA32_MISC_ENABLE, l1, l2);
if (!(l1 & (1<<11)))
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index e38d338a6447..0b6c52388cf4 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -591,7 +591,7 @@ cpuid4_cache_lookup_regs(int index, struct _cpuid4_info_regs *this_leaf)
unsigned edx;

if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
- if (cpu_has_topoext)
+ if (boot_cpu_has(X86_FEATURE_TOPOEXT))
cpuid_count(0x8000001d, index, &eax.full,
&ebx.full, &ecx.full, &edx);
else
@@ -637,7 +637,7 @@ static int find_num_cache_leaves(struct cpuinfo_x86 *c)
void init_amd_cacheinfo(struct cpuinfo_x86 *c)
{

- if (cpu_has_topoext) {
+ if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
num_cache_leaves = find_num_cache_leaves(c);
} else if (c->extended_cpuid_level >= 0x80000006) {
if (cpuid_edx(0x80000006) & 0xf000)
@@ -809,7 +809,7 @@ static int __cache_amd_cpumap_setup(unsigned int cpu, int index,
struct cacheinfo *this_leaf;
int i, sibling;

- if (cpu_has_topoext) {
+ if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
unsigned int apicid, nshared, first, last;

this_leaf = this_cpu_ci->info_list + index;
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 3b533cf37c74..8f2ef910c7bf 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -349,7 +349,7 @@ static void get_fixed_ranges(mtrr_type *frs)

void mtrr_save_fixed_ranges(void *info)
{
- if (cpu_has_mtrr)
+ if (static_cpu_has_safe(X86_FEATURE_MTRR))
get_fixed_ranges(mtrr_state.fixed_ranges);
}

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index f891b4750f04..5c3d149ee91c 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -682,7 +682,7 @@ void __init mtrr_bp_init(void)

phys_addr = 32;

- if (cpu_has_mtrr) {
+ if (boot_cpu_has(X86_FEATURE_MTRR)) {
mtrr_if = &generic_mtrr_ops;
size_or_mask = SIZE_OR_MASK_BITS(36);
size_and_mask = 0x00f00000;
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index 1cee5d2d7ece..3ea177cb7366 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -160,7 +160,7 @@ static inline int amd_pmu_addr_offset(int index, bool eventsel)
if (offset)
return offset;

- if (!cpu_has_perfctr_core)
+ if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
offset = index;
else
offset = index << 1;
@@ -652,7 +652,7 @@ static __initconst const struct x86_pmu amd_pmu = {

static int __init amd_core_pmu_init(void)
{
- if (!cpu_has_perfctr_core)
+ if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
return 0;

switch (boot_cpu_data.x86) {
diff --git a/arch/x86/kernel/cpu/perf_event_amd_uncore.c b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
index cc6cedb8f25d..49742746a6c9 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
@@ -523,10 +523,10 @@ static int __init amd_uncore_init(void)
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
goto fail_nodev;

- if (!cpu_has_topoext)
+ if (!boot_cpu_has(X86_FEATURE_TOPOEXT))
goto fail_nodev;

- if (cpu_has_perfctr_nb) {
+ if (boot_cpu_has(X86_FEATURE_PERFCTR_NB)) {
amd_uncore_nb = alloc_percpu(struct amd_uncore *);
if (!amd_uncore_nb) {
ret = -ENOMEM;
@@ -540,7 +540,7 @@ static int __init amd_uncore_init(void)
ret = 0;
}

- if (cpu_has_perfctr_l2) {
+ if (boot_cpu_has(X86_FEATURE_PERFCTR_L2)) {
amd_uncore_l2 = alloc_percpu(struct amd_uncore *);
if (!amd_uncore_l2) {
ret = -ENOMEM;
@@ -583,10 +583,11 @@ fail_online:

/* amd_uncore_nb/l2 should have been freed by cleanup_cpu_online */
amd_uncore_nb = amd_uncore_l2 = NULL;
- if (cpu_has_perfctr_l2)
+
+ if (boot_cpu_has(X86_FEATURE_PERFCTR_L2))
perf_pmu_unregister(&amd_l2_pmu);
fail_l2:
- if (cpu_has_perfctr_nb)
+ if (boot_cpu_has(X86_FEATURE_PERFCTR_NB))
perf_pmu_unregister(&amd_nb_pmu);
if (amd_uncore_l2)
free_percpu(amd_uncore_l2);
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index be39b5fde4b9..22abea04731e 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -12,7 +12,7 @@
*/
static void fpu__init_cpu_ctx_switch(void)
{
- if (!cpu_has_eager_fpu)
+ if (!boot_cpu_has(X86_FEATURE_EAGER_FPU))
stts();
else
clts();
@@ -287,7 +287,7 @@ static void __init fpu__init_system_ctx_switch(void)
current_thread_info()->status = 0;

/* Auto enable eagerfpu for xsaveopt */
- if (cpu_has_xsaveopt && eagerfpu != DISABLE)
+ if (boot_cpu_has(X86_FEATURE_XSAVEOPT) && eagerfpu != DISABLE)
eagerfpu = ENABLE;

if (xfeatures_mask & XFEATURE_MASK_EAGER) {
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 50a3fad5b89f..2bcfb5f2bc44 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -300,6 +300,10 @@ static int arch_build_bp_info(struct perf_event *bp)
return -EINVAL;
if (bp->attr.bp_addr & (bp->attr.bp_len - 1))
return -EINVAL;
+
+ if (!boot_cpu_has(X86_FEATURE_BPEXT))
+ return -EOPNOTSUPP;
+
/*
* It's impossible to use a range breakpoint to fake out
* user vs kernel detection because bp_len - 1 can't
@@ -307,8 +311,6 @@ static int arch_build_bp_info(struct perf_event *bp)
* breakpoints, then we'll have to check for kprobe-blacklisted
* addresses anywhere in the range.
*/
- if (!cpu_has_bpext)
- return -EOPNOTSUPP;
info->mask = bp->attr.bp_len - 1;
info->len = X86_BREAKPOINT_LEN_1;
}
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f2281e9cfdbe..24d57f77b3c1 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -304,7 +304,7 @@ do { \

static bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
{
- if (cpu_has_topoext) {
+ if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
int cpu1 = c->cpu_index, cpu2 = o->cpu_index;

if (c->phys_proc_id == o->phys_proc_id &&
diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 524619351961..483231ebbb0b 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -357,8 +357,10 @@ static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus)
tss = &per_cpu(cpu_tss, get_cpu());
/* make room for real-mode segments */
tsk->thread.sp0 += 16;
- if (cpu_has_sep)
+
+ if (static_cpu_has_safe(X86_FEATURE_SEP))
tsk->thread.sysenter_cs = 0;
+
load_sp0(tss, &tsk->thread);
put_cpu();

diff --git a/arch/x86/mm/setup_nx.c b/arch/x86/mm/setup_nx.c
index 90555bf60aa4..92e2eacb3321 100644
--- a/arch/x86/mm/setup_nx.c
+++ b/arch/x86/mm/setup_nx.c
@@ -31,7 +31,7 @@ early_param("noexec", noexec_setup);

void x86_configure_nx(void)
{
- if (cpu_has_nx && !disable_nx)
+ if (boot_cpu_has(X86_FEATURE_NX) && !disable_nx)
__supported_pte_mask |= _PAGE_NX;
else
__supported_pte_mask &= ~_PAGE_NX;
@@ -39,7 +39,7 @@ void x86_configure_nx(void)

void __init x86_report_nx(void)
{
- if (!cpu_has_nx) {
+ if (!boot_cpu_has(X86_FEATURE_NX)) {
printk(KERN_NOTICE "Notice: NX (Execute Disable) protection "
"missing in CPU!\n");
} else {
diff --git a/drivers/char/hw_random/via-rng.c b/drivers/char/hw_random/via-rng.c
index 0c98a9d51a24..44ce80606944 100644
--- a/drivers/char/hw_random/via-rng.c
+++ b/drivers/char/hw_random/via-rng.c
@@ -140,7 +140,7 @@ static int via_rng_init(struct hwrng *rng)
* RNG configuration like it used to be the case in this
* register */
if ((c->x86 == 6) && (c->x86_model >= 0x0f)) {
- if (!cpu_has_xstore_enabled) {
+ if (!boot_cpu_has(X86_FEATURE_XSTORE_EN)) {
pr_err(PFX "can't enable hardware RNG "
"if XSTORE is not enabled\n");
return -ENODEV;
@@ -200,8 +200,9 @@ static int __init mod_init(void)
{
int err;

- if (!cpu_has_xstore)
+ if (!boot_cpu_has(X86_FEATURE_XSTORE))
return -ENODEV;
+
pr_info("VIA RNG detected\n");
err = hwrng_register(&via_rng);
if (err) {
diff --git a/drivers/crypto/padlock-aes.c b/drivers/crypto/padlock-aes.c
index da2d6777bd09..97a364694bfc 100644
--- a/drivers/crypto/padlock-aes.c
+++ b/drivers/crypto/padlock-aes.c
@@ -515,7 +515,7 @@ static int __init padlock_init(void)
if (!x86_match_cpu(padlock_cpu_id))
return -ENODEV;

- if (!cpu_has_xcrypt_enabled) {
+ if (!boot_cpu_has(X86_FEATURE_XCRYPT_EN)) {
printk(KERN_NOTICE PFX "VIA PadLock detected, but not enabled. Hmm, strange...\n");
return -ENODEV;
}
diff --git a/drivers/crypto/padlock-sha.c b/drivers/crypto/padlock-sha.c
index 4e154c9b9206..8c5f90647b7a 100644
--- a/drivers/crypto/padlock-sha.c
+++ b/drivers/crypto/padlock-sha.c
@@ -540,7 +540,7 @@ static int __init padlock_init(void)
struct shash_alg *sha1;
struct shash_alg *sha256;

- if (!x86_match_cpu(padlock_sha_ids) || !cpu_has_phe_enabled)
+ if (!x86_match_cpu(padlock_sha_ids) || !boot_cpu_has(X86_FEATURE_PHE_EN))
return -ENODEV;

/* Register the newly added algorithm module if on *
diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 1fae1881648c..c12ba4516df2 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -753,7 +753,7 @@ static inline void set_irq_posting_cap(void)
* should have X86_FEATURE_CX16 support, this has been confirmed
* with Intel hardware guys.
*/
- if ( cpu_has_cx16 )
+ if (boot_cpu_has(X86_FEATURE_CX16))
intel_irq_remap_ops.capability |= 1 << IRQ_POSTING_CAP;

for_each_iommu(iommu, drhd)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 974be09e7556..42a378a4eefb 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -923,7 +923,7 @@ static int check_async_write(struct inode *inode, unsigned long bio_flags)
if (bio_flags & EXTENT_BIO_TREE_LOG)
return 0;
#ifdef CONFIG_X86
- if (cpu_has_xmm4_2)
+ if (static_cpu_has_safe(X86_FEATURE_XMM4_2))
return 0;
#endif
return 1;
--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2015-11-27 18:04:40

by Borislav Petkov

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Fri, Nov 27, 2015 at 02:52:57PM +0100, Borislav Petkov wrote:
> commit .TEXT .DATA .BSS
> rc2+ 650055 127948 1189128
> 0a53df8a1a3a ("x86/cpufeature: Move some of the...") 649863 127948 1189384
> ed03a85e6575 ("x86/cpufeature: Cleanup get_cpu_cap()") 649831 127948 1189384
> acde56aeda14 ("x86/cpufeature: Remove unused and...") 649831 127948 1189384
>
> I'll look at doing the macro thing now, hopefully it doesn't get too ugly.

Yeah, we do save ~1.6K of text (649831 above vs. 648209 below, i.e., 1622
bytes) for the price of slightly slower feature-bit testing. I don't know
whether it matters at all, though:

commit .TEXT .DATA .BSS
CONFIG_X86_FAST_FEATURE_TESTS 648209 127948 1189384

and the diff looks pretty simple:

---
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4a9b9a9a1a64..ff64585ea0bf 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -350,6 +350,10 @@ config X86_FEATURE_NAMES

If in doubt, say Y.

+config X86_FAST_FEATURE_TESTS
+ bool "Fast feature tests" if EMBEDDED
+ default y
+
config X86_X2APIC
bool "Support x2apic"
depends on X86_LOCAL_APIC && X86_64 && (IRQ_REMAP || HYPERVISOR_GUEST)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index cbe390044a7c..7ad8c9464297 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -410,7 +410,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
* fast paths and boot_cpu_has() otherwise!
*/

-#if __GNUC__ >= 4
+#if __GNUC__ >= 4 && defined(CONFIG_X86_FAST_FEATURE_TESTS)
extern void warn_pre_alternatives(void);
extern bool __static_cpu_has_safe(u16 bit);

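To make the trade-off concrete, here is a minimal standalone model of the
two test flavors - plain C, for illustration only, and emphatically not
the kernel implementation (the real static_cpu_has_safe() rewrites the
jump instruction itself via asm goto and the alternatives machinery,
rather than reading a variable; names below are made up):

#include <stdbool.h>
#include <stdint.h>

#define NCAPINTS		16
#define X86_FEATURE_MMX		(0*32 + 23)	/* word 0, bit 23 */

/* filled in from CPUID during "boot" */
static uint32_t x86_capability[NCAPINTS];

/*
 * Dynamic test, boot_cpu_has() style: a load and a bit test on every
 * invocation. Small, and needs no patching infrastructure.
 */
static inline bool dynamic_cpu_has(unsigned int bit)
{
	return x86_capability[bit >> 5] & (1u << (bit & 31));
}

/*
 * "Patched" test, static_cpu_has_safe() style: the answer is resolved
 * once, right after capability detection, and the hot path pays almost
 * nothing afterwards. The kernel resolves it by patching the code; a
 * variable written once at init is the closest portable model.
 */
static bool mmx_resolved;

static void apply_alternatives_model(void)	/* made-up name */
{
	mmx_resolved = dynamic_cpu_has(X86_FEATURE_MMX);
}

static inline bool model_static_cpu_has_mmx(void)
{
	return mmx_resolved;	/* in the kernel: a patched jmp/NOP */
}

The ~1.6K of text difference above would then, presumably, be the
replacement instructions and patching glue the real thing drags in; with
CONFIG_X86_FAST_FEATURE_TESTS=n everything falls back to the dynamic
flavor.
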
--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2015-11-27 20:14:09

by Josh Triplett

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Fri, Nov 27, 2015 at 07:04:33PM +0100, Borislav Petkov wrote:
> On Fri, Nov 27, 2015 at 02:52:57PM +0100, Borislav Petkov wrote:
> > commit .TEXT .DATA .BSS
> > rc2+ 650055 127948 1189128
> > 0a53df8a1a3a ("x86/cpufeature: Move some of the...") 649863 127948 1189384
> > ed03a85e6575 ("x86/cpufeature: Cleanup get_cpu_cap()") 649831 127948 1189384
> > acde56aeda14 ("x86/cpufeature: Remove unused and...") 649831 127948 1189384
> >
> > I'll look at doing the macro thing now, hopefully it doesn't get too ugly.
>
> Yeah, we do save ~1.6K of text (649831 above vs. 648209 below, i.e., 1622
> bytes) for the price of slightly slower feature-bit testing. I don't know
> whether it matters at all, though:
>
> commit .TEXT .DATA .BSS
> CONFIG_X86_FAST_FEATURE_TESTS 648209 127948 1189384
>
> and the diff looks pretty simple:

Given an appropriate long description for that config option, that seems
worthwhile. Something like this:

Some fast-paths in the kernel depend on the capabilities of the CPU.
Say Y here for the kernel to patch in the appropriate code at runtime
based on the capabilities of the CPU. The infrastructure for patching
code at runtime takes up some additional space; space-constrained
embedded systems may wish to say N here to produce smaller, slightly
slower code.
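
Folding that into the stanza from your diff, the option would then read
something like this (help text as suggested above):

config X86_FAST_FEATURE_TESTS
	bool "Fast feature tests" if EMBEDDED
	default y
	---help---
	  Some fast-paths in the kernel depend on the capabilities of the CPU.
	  Say Y here for the kernel to patch in the appropriate code at
	  runtime based on the capabilities of the CPU. The infrastructure
	  for patching code at runtime takes up some additional space;
	  space-constrained embedded systems may wish to say N here to
	  produce smaller, slightly slower code.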

> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 4a9b9a9a1a64..ff64585ea0bf 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -350,6 +350,10 @@ config X86_FEATURE_NAMES
>
> If in doubt, say Y.
>
> +config X86_FAST_FEATURE_TESTS
> + bool "Fast feature tests" if EMBEDDED
> + default y
> +
> config X86_X2APIC
> bool "Support x2apic"
> depends on X86_LOCAL_APIC && X86_64 && (IRQ_REMAP || HYPERVISOR_GUEST)
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index cbe390044a7c..7ad8c9464297 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -410,7 +410,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
> * fast paths and boot_cpu_has() otherwise!
> */
>
> -#if __GNUC__ >= 4
> +#if __GNUC__ >= 4 && defined(CONFIG_X86_FAST_FEATURE_TESTS)
> extern void warn_pre_alternatives(void);
> extern bool __static_cpu_has_safe(u16 bit);
>

2015-11-27 20:23:44

by Borislav Petkov

Subject: Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

On Fri, Nov 27, 2015 at 12:13:55PM -0800, Josh Triplett wrote:
> Given an appropriate long description for that config option, that seems
> worthwhile. Something like this:
>
> Some fast-paths in the kernel depend on the capabilities of the CPU.
> Say Y here for the kernel to patch in the appropriate code at runtime
> based on the capabilities of the CPU. The infrastructure for patching
> code at runtime takes up some additional space; space-constrained
> embedded systems may wish to say N here to produce smaller, slightly
> slower code.

Thanks for the text, it looks good and I'll use it. :)

And yes, considering how small the patch is, it is really worthwhile to
save ~1.6K of kernel text that easily.

I'll do a proper patch and run it through the build tests.

Thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.