2016-10-28 13:50:04

by Andrejczuk, Grzegorz

[permalink] [raw]
Subject: [PATCH v7 0/4] Enabling Ring 3 MONITOR/MWAIT feature for Knights Landing

These patches enable Intel Xeon Phi x200 feature to use MONITOR/MWAIT
instruction in ring 3 (userspace) Patches set MSR 0x140 for all logical CPUs.
Then expose it as CPU feature and introduces elf HWCAP capability for x86.
Reference:
https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait

v7:
Change order of the patches, with this code looks cleaner.
Changed the name of MSR to MSR_MISC_FEATURE_ENABLES.
Used Word 3 25th bit to expose feature.

v6:

v5:
When phir3mwait=disable is cmdline switch off r3 mwait feature
Fix typos

v4:
Wrapped the enabling code by CONFIG_X86_64
Add documentation for phir3mwait=disable cmdline switch
Move probe_ function call from early_intel_init to intel_init
Fixed commit messages

v3:
Included Daves and Thomas comments

v2:
Check MSR before wrmsrl
Shortened names
Used Word 3 for feature init_scattered_cpuid_features()
Fixed commit messages

Grzegorz Andrejczuk (4):
x86/msr: Add MSR(140H) and PHIR3MWAIT bit to msr-info.h
x86: Use HWCAP2 to expose Xeon Phi ring 3 MWAIT
x86/cpufeature: Add PHIR3MWAIT to CPU features
x86: Add enabling of the R3MWAIT during boot

Documentation/kernel-parameters.txt | 5 +++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/elf.h | 9 ++++++++
arch/x86/include/asm/msr-index.h | 5 +++++
arch/x86/include/uapi/asm/hwcap2.h | 7 ++++++
arch/x86/kernel/cpu/common.c | 3 +++
arch/x86/kernel/cpu/intel.c | 43 +++++++++++++++++++++++++++++++++++++
7 files changed, 73 insertions(+)
create mode 100644 arch/x86/include/uapi/asm/hwcap2.h

--
2.5.1


2016-10-28 13:50:08

by Andrejczuk, Grzegorz

[permalink] [raw]
Subject: [PATCH v7 2/4] x86: Use HWCAP2 to expose Xeon Phi ring 3 MWAIT

Add HWCAP2 for x86 and reserve its 1st bit to expose
Xeon Phi ring 3 monitor/mwait to userspace apps.

Signed-off-by: Grzegorz Andrejczuk <[email protected]>
---
arch/x86/include/asm/elf.h | 9 +++++++++
arch/x86/include/uapi/asm/hwcap2.h | 7 +++++++
arch/x86/kernel/cpu/common.c | 3 +++
3 files changed, 19 insertions(+)
create mode 100644 arch/x86/include/uapi/asm/hwcap2.h

diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index e7f155c..59703aa 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -258,6 +258,15 @@ extern int force_personality32;

#define ELF_HWCAP (boot_cpu_data.x86_capability[CPUID_1_EDX])

+extern unsigned int elf_hwcap2;
+
+/*
+ * HWCAP2 supplies mask with kernel enabled CPU features, so that
+ * the application can discover that it can safely use them.
+ * The bits are defined in uapi/asm/hwcap2.h.
+ */
+#define ELF_HWCAP2 elf_hwcap2
+
/* This yields a string that ld.so will use to load implementation
specific libraries for optimization. This is more specific in
intent than poking at uname or /proc/cpuinfo.
diff --git a/arch/x86/include/uapi/asm/hwcap2.h b/arch/x86/include/uapi/asm/hwcap2.h
new file mode 100644
index 0000000..90ef445
--- /dev/null
+++ b/arch/x86/include/uapi/asm/hwcap2.h
@@ -0,0 +1,7 @@
+#ifndef _ASM_HWCAP2_H
+#define _ASM_HWCAP2_H
+
+/* Kernel enabled Ring 3 MWAIT for Xeon Phi*/
+#define HWCAP2_PHIR3MWAIT (1 << 0)
+
+#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index bcc9ccc..fdbf708 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -35,6 +35,7 @@
#include <asm/desc.h>
#include <asm/fpu/internal.h>
#include <asm/mtrr.h>
+#include <asm/hwcap2.h>
#include <linux/numa.h>
#include <asm/asm.h>
#include <asm/bugs.h>
@@ -51,6 +52,8 @@

#include "cpu.h"

+unsigned int elf_hwcap2 __read_mostly;
+
/* all of these masks are initialized in setup_cpu_local_masks() */
cpumask_var_t cpu_initialized_mask;
cpumask_var_t cpu_callout_mask;
--
2.5.1

2016-10-28 13:50:20

by Andrejczuk, Grzegorz

[permalink] [raw]
Subject: [PATCH v7 4/4] x86: Add enabling of the R3MWAIT during boot

If processor is Intel Xeon Phi x200 we enable user-level mwait feature.
Enabling this feature suppresses invalid-opcode error, when MONITOR/MWAIT
is called from ring 3.

Signed-off-by: Grzegorz Andrejczuk <[email protected]>
---
Documentation/kernel-parameters.txt | 5 +++++
arch/x86/kernel/cpu/intel.c | 43 +++++++++++++++++++++++++++++++++++++
2 files changed, 48 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index a4f4d69..d58915b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3120,6 +3120,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
pg. [PARIDE]
See Documentation/blockdev/paride.txt.

+ phir3mwait= [X86] Disable Intel Xeon Phi x200 ring 3 MONITOR/MWAIT
+ feature for all cpus.
+ Format: { disable }
+ See arch/x86/kernel/cpu/intel.c
+
pirq= [SMP,APIC] Manual mp-table setup
See Documentation/x86/i386/IO-APIC.txt.

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index fcd484d..127dfdd 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -14,6 +14,8 @@
#include <asm/bugs.h>
#include <asm/cpu.h>
#include <asm/intel-family.h>
+#include <asm/hwcap2.h>
+#include <asm/elf.h>

#ifdef CONFIG_X86_64
#include <linux/topology.h>
@@ -61,6 +63,45 @@ void check_mpx_erratum(struct cpuinfo_x86 *c)
}
}

+#ifdef CONFIG_X86_64
+static int phi_r3mwait_disabled __read_mostly;
+
+static int __init phir3mwait_disable(char *__unused)
+{
+ phi_r3mwait_disabled = 1;
+ pr_warn("x86/phir3mwait: Disabled ring 3 MWAIT for Xeon Phi");
+ return 1;
+}
+__setup("phir3mwait=disable", phir3mwait_disable);
+
+static void probe_xeon_phi_r3mwait(struct cpuinfo_x86 *c)
+{
+ u64 msr;
+
+ /*
+ * Setting ring 3 MONITOR/MWAIT for each logical CPU
+ * return when CPU is not Xeon Phi Family x200 (KnightsLanding).
+ */
+ if (c->x86 != 6 || c->x86_model != INTEL_FAM6_XEON_PHI_KNL)
+ return;
+
+ rdmsrl(MSR_MISC_FEATURE_ENABLES, msr);
+
+ if (phi_r3mwait_disabled) {
+ msr &= ~MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT;
+ wrmsrl(MSR_MISC_FEATURE_ENABLES, msr);
+ } else {
+ msr |= MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT;
+ wrmsrl(MSR_MISC_FEATURE_ENABLES, msr);
+ set_cpu_cap(c, X86_FEATURE_PHIR3MWAIT);
+ ELF_HWCAP2 |= HWCAP2_PHIR3MWAIT;
+ }
+}
+
+#else
+static void probe_xeon_phi_r3mwait(struct cpuinfo_x86 *__unused) {}
+#endif
+
static void early_init_intel(struct cpuinfo_x86 *c)
{
u64 misc_enable;
@@ -565,6 +606,8 @@ static void init_intel(struct cpuinfo_x86 *c)
detect_vmx_virtcap(c);

init_intel_energy_perf(c);
+
+ probe_xeon_phi_r3mwait(c);
}

#ifdef CONFIG_X86_32
--
2.5.1

2016-10-28 13:50:12

by Andrejczuk, Grzegorz

[permalink] [raw]
Subject: [PATCH v7 3/4] x86/cpufeature: Add PHIR3MWAIT to CPU features

Add Intel Xeon Phi x200 (KnightsLanding) cpu feature - ring 3 monitor/mwait

Signed-off-by: Grzegorz Andrejczuk <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 92a8308..eb88eeb 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -101,6 +101,7 @@
#define X86_FEATURE_TSC_RELIABLE ( 3*32+23) /* TSC is known to be reliable */
#define X86_FEATURE_NONSTOP_TSC ( 3*32+24) /* TSC does not stop in C states */
/* free, was #define X86_FEATURE_CLFLUSH_MONITOR ( 3*32+25) * "" clflush reqd with monitor */
+#define X86_FEATURE_PHIR3MWAIT ( 3*32+25) /* Xeon Phi x200 ring 3 MONITOR/MWAIT */
#define X86_FEATURE_EXTD_APICID ( 3*32+26) /* has extended APICID (8 bits) */
#define X86_FEATURE_AMD_DCM ( 3*32+27) /* multi-node processor */
#define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */
--
2.5.1

2016-10-28 13:50:47

by Andrejczuk, Grzegorz

[permalink] [raw]
Subject: [PATCH v7 1/4] x86/msr: Add MSR(140H) and PHIR3MWAIT bit to msr-info.h

Intel Xeon Phi x200 (codenamed Knights Landing) has MSR
MISC_FEATURE_ENABLES 0x140.

Setting 2nd bit of this register makes MONITOR and MWAIT instructions
do not cause invalid-opcode exception when called from ring different
than 0.

Hex Dec Name Scope
140H 320 MISC_FEATURE_ENABLES Thread
0 Reserved
1 if set to 1, the MONITOR and MWAIT instructions do not
cause invalid-opcode exceptions when executed with CPL > 0
or in virtual-8086 mode. If MWAIT is executed when CPL > 0
or in virtual-8086 mode, and if EAX indicates a C-state
other than C0 or C1, the instruction operates as if EAX
indicated the C-state C1.
63:2 Reserved

Signed-off-by: Grzegorz Andrejczuk <[email protected]>
---
arch/x86/include/asm/msr-index.h | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 56f4c66..0fc220d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -540,6 +540,11 @@
#define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT 39
#define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE (1ULL << MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT)

+/* Intel Xeon Phi x200 ring 3 MONITOR/MWAIT */
+#define MSR_MISC_FEATURE_ENABLES 0x00000140
+#define MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT_BIT 1
+#define MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT (1ULL << MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT_BIT)
+
#define MSR_IA32_TSC_DEADLINE 0x000006E0

/* P4/Xeon+ specific */
--
2.5.1

2016-10-31 01:17:34

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v7 2/4] x86: Use HWCAP2 to expose Xeon Phi ring 3 MWAIT

On Fri, 28 Oct 2016, Grzegorz Andrejczuk wrote:

The subject subsystem wants to be 'x86/elf:'

And please remove that Xeon PHI reference. See below.

> Add HWCAP2 for x86 and reserve its 1st bit to expose

s/1st bit/bit 0/ please

> +
> +/* Kernel enabled Ring 3 MWAIT for Xeon Phi*/
> +#define HWCAP2_PHIR3MWAIT (1 << 0)

Can we please make that

/* MWAIT/MONITOR are available in ring 3 */
#define HWCAP2_RING3MWAIT (1 << 0)

Simply because this feature might become available on other platforms and
then the PHI thing does not make any sense. We do not care on which CPU
model this works, we care about the fact that it works.

Thanks,

tglx

2016-10-31 01:17:55

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v7 3/4] x86/cpufeature: Add PHIR3MWAIT to CPU features

On Fri, 28 Oct 2016, Grzegorz Andrejczuk wrote:

> Add Intel Xeon Phi x200 (KnightsLanding) cpu feature - ring 3 monitor/mwait
>
> Signed-off-by: Grzegorz Andrejczuk <[email protected]>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 92a8308..eb88eeb 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -101,6 +101,7 @@
> #define X86_FEATURE_TSC_RELIABLE ( 3*32+23) /* TSC is known to be reliable */
> #define X86_FEATURE_NONSTOP_TSC ( 3*32+24) /* TSC does not stop in C states */
> /* free, was #define X86_FEATURE_CLFLUSH_MONITOR ( 3*32+25) * "" clflush reqd with monitor */

Bah. Do you really think that this comment is still useful? Can you please
get your act together and switch on your brain instead of mindlessly
slapping stuff into the code?

> +#define X86_FEATURE_PHIR3MWAIT ( 3*32+25) /* Xeon Phi x200 ring 3 MONITOR/MWAIT */

Thanks,

tglx

2016-10-31 01:18:06

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v7 4/4] x86: Add enabling of the R3MWAIT during boot

On Fri, 28 Oct 2016, Grzegorz Andrejczuk wrote:

> Subject: x86: Add enabling of the R3MWAIT during boot

This is not a proper sentence and aside of that it's wrong, because the
feature handling (enable or disable) is not only done during boot. It's
also done on each cpu hotplug operation.

x86/cpufeatures: Handle RING3MWAIT on Xeon Phi models

would be a proper changelog.

> If processor is Intel Xeon Phi x200 we enable user-level mwait feature.
> Enabling this feature suppresses invalid-opcode error, when MONITOR/MWAIT
> is called from ring 3.

The changelog is missing an explanation that this feature is not
discoverable and is hardcoded enabled depending on cpu vendor/family/model/

> + phir3mwait= [X86] Disable Intel Xeon Phi x200 ring 3 MONITOR/MWAIT
> + feature for all cpus.
> + Format: { disable }
> + See arch/x86/kernel/cpu/intel.c

The feature detection/enabling might move to some other file and then this
becomes stale. Add a proper documentation file.

> +#ifdef CONFIG_X86_64
> +static int phi_r3mwait_disabled __read_mostly;
> +
> +static int __init phir3mwait_disable(char *__unused)
> +{
> + phi_r3mwait_disabled = 1;
> + pr_warn("x86/phir3mwait: Disabled ring 3 MWAIT for Xeon Phi");

Are you actually reading my reviews at all? If you disagree, then please do
so in a reply to my review, but silently ignoring it is not working.

> + return 1;
> +}
> +__setup("phir3mwait=disable", phir3mwait_disable);
> +
> +static void probe_xeon_phi_r3mwait(struct cpuinfo_x86 *c)
> +{
> + u64 msr;
> +
> + /*
> + * Setting ring 3 MONITOR/MWAIT for each logical CPU
> + * return when CPU is not Xeon Phi Family x200 (KnightsLanding).

Comments which tell what the code does are pointless, because that can be
seen from the code. The information which needs to be in a comment is WHY
this is done, i.e. non discoverability wants to be documented.

> + */
> + if (c->x86 != 6 || c->x86_model != INTEL_FAM6_XEON_PHI_KNL)
> + return;
> +
> + rdmsrl(MSR_MISC_FEATURE_ENABLES, msr);

We can avoid that whole rdmsrl/wrmsrl dance and simply use
msr_set/clear_bit() below.

> +
> + if (phi_r3mwait_disabled) {
> + msr &= ~MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT;
> + wrmsrl(MSR_MISC_FEATURE_ENABLES, msr);
> + } else {
> + msr |= MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT;
> + wrmsrl(MSR_MISC_FEATURE_ENABLES, msr);
> + set_cpu_cap(c, X86_FEATURE_PHIR3MWAIT);
> + ELF_HWCAP2 |= HWCAP2_PHIR3MWAIT;
> + }
> +}

Thanks,

tglx

2016-10-31 01:18:14

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v7 1/4] x86/msr: Add MSR(140H) and PHIR3MWAIT bit to msr-info.h

On Fri, 28 Oct 2016, Grzegorz Andrejczuk wrote:

> Subject: x86/msr: Add MSR(140H) and PHIR3MWAIT bit to msr-info.h

What kind of information is this: MSR(140h)? None at all. That MSR has a
name. So why not using it?

x86/msr: Add MSR_MISC_FEATURE_ENABLES and PHIR3MWAIT bit

Hmm? You might have noticed that I removed the 'to msr-info.h' part. I did
this because it's useless information. We can see that from the patch.

> Intel Xeon Phi x200 (codenamed Knights Landing) has MSR
> MISC_FEATURE_ENABLES 0x140.

And that tells us what?

> Setting 2nd bit of this register makes MONITOR and MWAIT instructions
> do not cause invalid-opcode exception when called from ring different
> than 0.

Can you please be a bit more careful with your changelogs? I give you an
example for a proper one:

Intel Xeon Phi x200 (codenamed Knights Landing) allows to enable MONITOR
and MWAIT instructions outside of ring 0.

The feature is controlled by MSR MISC_FEATURE_ENABLES (0x144). Setting
bit 0 of this register enables it, so MONITOR and MWAIT instructions do
not cause invalid-opcode exceptions when invoked outside of ring 0.

The feature MSR is not yet documented in the SDM. Here is the relevant
documentation:

Can you spot the difference?

Hint: Copy and paste are dangerous

> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -540,6 +540,11 @@
> #define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT 39
> #define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE (1ULL << MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT)
>
> +/* Intel Xeon Phi x200 ring 3 MONITOR/MWAIT */
> +#define MSR_MISC_FEATURE_ENABLES 0x00000140
> +#define MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT_BIT 1
> +#define MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT (1ULL << MSR_MISC_FEATURE_ENABLES_PHIR3MWAIT_BIT)
> +

See comment on patch 4/4 why this bit mask is not required.

Thanks,

tglx

2016-12-03 20:27:55

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH v7 0/4] Enabling Ring 3 MONITOR/MWAIT feature for Knights Landing

On Fri 2016-10-28 15:49:51, Grzegorz Andrejczuk wrote:
> These patches enable Intel Xeon Phi x200 feature to use MONITOR/MWAIT
> instruction in ring 3 (userspace) Patches set MSR 0x140 for all
logical CPUs.

> Then expose it as CPU feature and introduces elf HWCAP capability
> for x86.

Elf hardware capability? Why do that? Normally CPU features go to
/proc/cpuinfo...

Plus... Can you elaborate on security effects of this? Does monitor
allow monitoring memory the current process can not write to? Is it
good idea to let userland process watch for NMIs?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Attachments:
(No filename) (711.00 B)
signature.asc (181.00 B)
Digital signature
Download all attachments