2018-01-05 16:07:45

by Tom Lendacky

[permalink] [raw]
Subject: [PATCH v1 0/3] x86/cpu/AMD: Make LFENCE a serializing instruction on AMD

To aid in speculation control, the LFENCE instruction will be turned into
a serializing instruction. There is less performance impact using LFENCE
in this way compared to MFENCE.

With LFENCE now being a serializing instruction, it can be used in
rdtsc_ordered() in place of MFENCE_RDTSC. The other two patches in this
series make this change and remove the MFENCE_RDTSC feature.

The following patches are included in this series:
- Make LFENCE a serializing instruction on AMD
- Change over to LFENCE_RDTSC from MFENCE_RDTSC on AMD
- Remove the MFENCE_RDTSC feature

This patch series is based on tip:x86/pti.

---

Tom Lendacky (3):
x86/cpu/AMD: Make LFENCE a serializing instruction
x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC
x86/msr: Remove now unused definition of MFENCE_RDTSC feature


arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/include/asm/msr.h | 3 +--
arch/x86/kernel/cpu/amd.c | 13 +++++++++++--
4 files changed, 15 insertions(+), 5 deletions(-)

--
Tom Lendacky


2018-01-05 16:07:57

by Tom Lendacky

[permalink] [raw]
Subject: [PATCH v1 1/3] x86/cpu/AMD: Make LFENCE a serializing instruction

To aid in speculation control, make LFENCE a serializing instruction.
This is done by setting bit 1 of MSR 0xc0011029 (DE_CFG). Some families
that support LFENCE do not have this MSR. For these families, the LFENCE
instruction is already serializing.

Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/kernel/cpu/amd.c | 9 +++++++++
2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index ab02261..1e7d710 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -352,6 +352,8 @@
#define FAM10H_MMIO_CONF_BASE_MASK 0xfffffffULL
#define FAM10H_MMIO_CONF_BASE_SHIFT 20
#define MSR_FAM10H_NODE_ID 0xc001100c
+#define MSR_F10H_DECFG 0xc0011029
+#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT 1

/* K8 MSRs */
#define MSR_K8_TOP_MEM1 0xc001001a
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index bcb75dc..fbd439e 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -829,6 +829,15 @@ static void init_amd(struct cpuinfo_x86 *c)
set_cpu_cap(c, X86_FEATURE_K8);

if (cpu_has(c, X86_FEATURE_XMM2)) {
+ /*
+ * Use LFENCE for execution serialization. On families which
+ * don't have that MSR, LFENCE is already serializing.
+ * msr_set_bit() uses the safe accessors, too, even if the MSR
+ * is not present.
+ */
+ msr_set_bit(MSR_F10H_DECFG,
+ MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
+
/* MFENCE stops RDTSC speculation */
set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
}

2018-01-05 16:08:05

by Tom Lendacky

[permalink] [raw]
Subject: [PATCH v1 2/3] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

With LFENCE now a serializing instruction, set the LFENCE_RDTSC
feature since the LFENCE instruction has less overhead than the
MFENCE instruction.

Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/kernel/cpu/amd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index fbd439e..b221fe5 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -838,8 +838,8 @@ static void init_amd(struct cpuinfo_x86 *c)
msr_set_bit(MSR_F10H_DECFG,
MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);

- /* MFENCE stops RDTSC speculation */
- set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
+ /* A serializing LFENCE stops RDTSC speculation */
+ set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
}

/*

2018-01-05 16:08:21

by Tom Lendacky

[permalink] [raw]
Subject: [PATCH v1 3/3] x86/msr: Remove now unused definition of MFENCE_RDTSC feature

With the switch to using LFENCE_RDTSC on AMD platforms there is no longer
a need for the MFENCE_RDTSC feature. Remove its usage and definition.

Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/msr.h | 3 +--
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 07cdd17..14ad778 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -96,7 +96,7 @@
#define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in IA32 userspace */
#define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in IA32 userspace */
#define X86_FEATURE_REP_GOOD ( 3*32+16) /* REP microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) /* "" MFENCE synchronizes RDTSC */
+/* free, was: #define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) "" MFENCE synchronizes RDTSC */
#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" LFENCE synchronizes RDTSC */
#define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */
#define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 07962f5..8d8d7ae 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -214,8 +214,7 @@ static __always_inline unsigned long long rdtsc_ordered(void)
* that some other imaginary CPU is updating continuously with a
* time stamp.
*/
- alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
- "lfence", X86_FEATURE_LFENCE_RDTSC);
+ alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC);
return rdtsc();
}


2018-01-05 16:35:26

by Brian Gerst

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] x86/cpu/AMD: Make LFENCE a serializing instruction

On Fri, Jan 5, 2018 at 11:07 AM, Tom Lendacky <[email protected]> wrote:
> To aid in speculation control, make LFENCE a serializing instruction.
> This is done by setting bit 1 of MSR 0xc0011029 (DE_CFG). Some families
> that support LFENCE do not have this MSR. For these families, the LFENCE
> instruction is already serializing.

Does this require a microcode update?

--
Brian Gerst

2018-01-05 16:36:52

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] x86/cpu/AMD: Make LFENCE a serializing instruction

On 1/5/2018 10:35 AM, Brian Gerst wrote:
> On Fri, Jan 5, 2018 at 11:07 AM, Tom Lendacky <[email protected]> wrote:
>> To aid in speculation control, make LFENCE a serializing instruction.
>> This is done by setting bit 1 of MSR 0xc0011029 (DE_CFG). Some families
>> that support LFENCE do not have this MSR. For these families, the LFENCE
>> instruction is already serializing.
>
> Does this require a microcode update?

No, it doesn't.

Thanks,
Tom

>
> --
> Brian Gerst
>

2018-01-05 16:56:08

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v1 0/3] x86/cpu/AMD: Make LFENCE a serializing instruction on AMD

On Fri, Jan 05, 2018 at 10:07:36AM -0600, Tom Lendacky wrote:
> To aid in speculation control, the LFENCE instruction will be turned into
> a serializing instruction. There is less performance impact using LFENCE
> in this way compared to MFENCE.
>
> With LFENCE now being a serializing instruction, it can be used in
> rdtsc_ordered() in place of MFENCE_RDTSC. The other two patches in this
> series make this change and remove the MFENCE_RDTSC feature.
>
> The following patches are included in this series:
> - Make LFENCE a serializing instruction on AMD
> - Change over to LFENCE_RDTSC from MFENCE_RDTSC on AMD
> - Remove the MFENCE_RDTSC feature
>
> This patch series is based on tip:x86/pti.
>
> ---
>
> Tom Lendacky (3):
> x86/cpu/AMD: Make LFENCE a serializing instruction
> x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC
> x86/msr: Remove now unused definition of MFENCE_RDTSC feature

All three:

Reviewed-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Subject: [tip:x86/pti] x86/cpu/AMD: Make LFENCE a serializing instruction

Commit-ID: 0592b0bce1694957fed178fc52f4b11576714b07
Gitweb: https://git.kernel.org/tip/0592b0bce1694957fed178fc52f4b11576714b07
Author: Tom Lendacky <[email protected]>
AuthorDate: Fri, 5 Jan 2018 10:07:46 -0600
Committer: Thomas Gleixner <[email protected]>
CommitDate: Sat, 6 Jan 2018 21:57:40 +0100

x86/cpu/AMD: Make LFENCE a serializing instruction

To aid in speculation control, make LFENCE a serializing instruction.
This is done by setting bit 1 of MSR 0xc0011029 (DE_CFG). Some families
that support LFENCE do not have this MSR. For these families, the LFENCE
instruction is already serializing.

Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/kernel/cpu/amd.c | 9 +++++++++
2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index ab02261..1e7d710 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -352,6 +352,8 @@
#define FAM10H_MMIO_CONF_BASE_MASK 0xfffffffULL
#define FAM10H_MMIO_CONF_BASE_SHIFT 20
#define MSR_FAM10H_NODE_ID 0xc001100c
+#define MSR_F10H_DECFG 0xc0011029
+#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT 1

/* K8 MSRs */
#define MSR_K8_TOP_MEM1 0xc001001a
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index bcb75dc..fbd439e 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -829,6 +829,15 @@ static void init_amd(struct cpuinfo_x86 *c)
set_cpu_cap(c, X86_FEATURE_K8);

if (cpu_has(c, X86_FEATURE_XMM2)) {
+ /*
+ * Use LFENCE for execution serialization. On families which
+ * don't have that MSR, LFENCE is already serializing.
+ * msr_set_bit() uses the safe accessors, too, even if the MSR
+ * is not present.
+ */
+ msr_set_bit(MSR_F10H_DECFG,
+ MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
+
/* MFENCE stops RDTSC speculation */
set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
}

Subject: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

Commit-ID: 0bf17c102177d5da9363bf8b1e4704b9996d5079
Gitweb: https://git.kernel.org/tip/0bf17c102177d5da9363bf8b1e4704b9996d5079
Author: Tom Lendacky <[email protected]>
AuthorDate: Fri, 5 Jan 2018 10:07:56 -0600
Committer: Thomas Gleixner <[email protected]>
CommitDate: Sat, 6 Jan 2018 21:57:40 +0100

x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

With LFENCE now a serializing instruction, set the LFENCE_RDTSC
feature since the LFENCE instruction has less overhead than the
MFENCE instruction.

Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/kernel/cpu/amd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index fbd439e..b221fe5 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -838,8 +838,8 @@ static void init_amd(struct cpuinfo_x86 *c)
msr_set_bit(MSR_F10H_DECFG,
MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);

- /* MFENCE stops RDTSC speculation */
- set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
+ /* A serializing LFENCE stops RDTSC speculation */
+ set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
}

/*

Subject: [tip:x86/pti] x86/msr: Remove now unused definition of MFENCE_RDTSC feature

Commit-ID: eeab3eee2fa4a8e8eb52e2abf034f14f1d010e0d
Gitweb: https://git.kernel.org/tip/eeab3eee2fa4a8e8eb52e2abf034f14f1d010e0d
Author: Tom Lendacky <[email protected]>
AuthorDate: Fri, 5 Jan 2018 10:08:05 -0600
Committer: Thomas Gleixner <[email protected]>
CommitDate: Sat, 6 Jan 2018 21:57:41 +0100

x86/msr: Remove now unused definition of MFENCE_RDTSC feature

With the switch to using LFENCE_RDTSC on AMD platforms there is no longer
a need for the MFENCE_RDTSC feature. Remove its usage and definition.

Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/msr.h | 3 +--
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 1641c2f..511d909 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -96,7 +96,7 @@
#define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in IA32 userspace */
#define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in IA32 userspace */
#define X86_FEATURE_REP_GOOD ( 3*32+16) /* REP microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) /* "" MFENCE synchronizes RDTSC */
+/* free, was: #define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) "" MFENCE synchronizes RDTSC */
#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" LFENCE synchronizes RDTSC */
#define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */
#define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 07962f5..8d8d7ae 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -214,8 +214,7 @@ static __always_inline unsigned long long rdtsc_ordered(void)
* that some other imaginary CPU is updating continuously with a
* time stamp.
*/
- alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
- "lfence", X86_FEATURE_LFENCE_RDTSC);
+ alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC);
return rdtsc();
}


2018-01-08 10:09:29

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On Sat, 6 Jan 2018, tip-bot for Tom Lendacky wrote:

> Commit-ID: 0bf17c102177d5da9363bf8b1e4704b9996d5079
> Gitweb: https://git.kernel.org/tip/0bf17c102177d5da9363bf8b1e4704b9996d5079
> Author: Tom Lendacky <[email protected]>
> AuthorDate: Fri, 5 Jan 2018 10:07:56 -0600
> Committer: Thomas Gleixner <[email protected]>
> CommitDate: Sat, 6 Jan 2018 21:57:40 +0100
>
> x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC
>
> With LFENCE now a serializing instruction, set the LFENCE_RDTSC
> feature since the LFENCE instruction has less overhead than the
> MFENCE instruction.

Second thoughts on that. As pointed out by someone in one of the insane
long threads:

What happens if the kernel runs as a guest and

- the hypervisor did not set the LFENCE to serializing on the host

- the hypervisor does not allow writing MSR_AMD64_DE_CFG

That would bring the guest into a pretty bad state or am I missing
something essential here?

I'm dropping these patches until this question is answered.

Thanks,

tglx

2018-01-08 10:24:16

by Woodhouse, David

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On Mon, 2018-01-08 at 11:08 +0100, Thomas Gleixner wrote:
> I'm dropping these patches until this question is answered.

I've rebased my retpoline tree on top of tip/x86/pti from before those
patches (from my BUG_SPECTRE_Vx patch).


Attachments:
smime.p7s (5.09 kB)

2018-01-08 10:25:59

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On Mon, 8 Jan 2018, Woodhouse, David wrote:

> On Mon, 2018-01-08 at 11:08 +0100, Thomas Gleixner wrote:
> > I'm dropping these patches until this question is answered.
>
> I've rebased my retpoline tree on top of tip/x86/pti from before those
> patches (from my BUG_SPECTRE_Vx patch).

That's not a problem. They still apply

2018-01-08 10:40:52

by Andrew Cooper

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On 08/01/18 10:08, Thomas Gleixner wrote:
> On Sat, 6 Jan 2018, tip-bot for Tom Lendacky wrote:
>
>> Commit-ID: 0bf17c102177d5da9363bf8b1e4704b9996d5079
>> Gitweb: https://git.kernel.org/tip/0bf17c102177d5da9363bf8b1e4704b9996d5079
>> Author: Tom Lendacky <[email protected]>
>> AuthorDate: Fri, 5 Jan 2018 10:07:56 -0600
>> Committer: Thomas Gleixner <[email protected]>
>> CommitDate: Sat, 6 Jan 2018 21:57:40 +0100
>>
>> x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC
>>
>> With LFENCE now a serializing instruction, set the LFENCE_RDTSC
>> feature since the LFENCE instruction has less overhead than the
>> MFENCE instruction.
> Second thoughts on that. As pointed out by someone in one of the insane
> long threads:
>
> What happens if the kernel runs as a guest and
>
> - the hypervisor did not set the LFENCE to serializing on the host
>
> - the hypervisor does not allow writing MSR_AMD64_DE_CFG
>
> That would bring the guest into a pretty bad state or am I missing
> something essential here?

What I did in Xen was to attempt to set it, then read it back and see. 
If LFENCE still isn't serialising, using repoline is the only available
mitigation.

My understanding from the folk at AMD is that retpoline is safe to use,
but has higher overhead than the LFENCE approach.

~Andrew

2018-01-08 11:11:16

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On Mon, 8 Jan 2018, Andrew Cooper wrote:

> On 08/01/18 10:08, Thomas Gleixner wrote:
> > On Sat, 6 Jan 2018, tip-bot for Tom Lendacky wrote:
> >
> >> Commit-ID: 0bf17c102177d5da9363bf8b1e4704b9996d5079
> >> Gitweb: https://git.kernel.org/tip/0bf17c102177d5da9363bf8b1e4704b9996d5079
> >> Author: Tom Lendacky <[email protected]>
> >> AuthorDate: Fri, 5 Jan 2018 10:07:56 -0600
> >> Committer: Thomas Gleixner <[email protected]>
> >> CommitDate: Sat, 6 Jan 2018 21:57:40 +0100
> >>
> >> x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC
> >>
> >> With LFENCE now a serializing instruction, set the LFENCE_RDTSC
> >> feature since the LFENCE instruction has less overhead than the
> >> MFENCE instruction.
> > Second thoughts on that. As pointed out by someone in one of the insane
> > long threads:
> >
> > What happens if the kernel runs as a guest and
> >
> > - the hypervisor did not set the LFENCE to serializing on the host
> >
> > - the hypervisor does not allow writing MSR_AMD64_DE_CFG
> >
> > That would bring the guest into a pretty bad state or am I missing
> > something essential here?
>
> What I did in Xen was to attempt to set it, then read it back and see. 
> If LFENCE still isn't serialising, using repoline is the only available
> mitigation.
>
> My understanding from the folk at AMD is that retpoline is safe to use,
> but has higher overhead than the LFENCE approach.

That still does not help vs. rdtsc_ordered() and LFENCE_RDTSC ...

Thanks,

tglx

2018-01-08 14:47:26

by Tom Lendacky

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On 1/8/2018 5:10 AM, Thomas Gleixner wrote:
> On Mon, 8 Jan 2018, Andrew Cooper wrote:
>
>> On 08/01/18 10:08, Thomas Gleixner wrote:
>>> On Sat, 6 Jan 2018, tip-bot for Tom Lendacky wrote:
>>>
>>>> Commit-ID: 0bf17c102177d5da9363bf8b1e4704b9996d5079
>>>> Gitweb: https://git.kernel.org/tip/0bf17c102177d5da9363bf8b1e4704b9996d5079
>>>> Author: Tom Lendacky <[email protected]>
>>>> AuthorDate: Fri, 5 Jan 2018 10:07:56 -0600
>>>> Committer: Thomas Gleixner <[email protected]>
>>>> CommitDate: Sat, 6 Jan 2018 21:57:40 +0100
>>>>
>>>> x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC
>>>>
>>>> With LFENCE now a serializing instruction, set the LFENCE_RDTSC
>>>> feature since the LFENCE instruction has less overhead than the
>>>> MFENCE instruction.
>>> Second thoughts on that. As pointed out by someone in one of the insane
>>> long threads:
>>>
>>> What happens if the kernel runs as a guest and
>>>
>>> - the hypervisor did not set the LFENCE to serializing on the host
>>>
>>> - the hypervisor does not allow writing MSR_AMD64_DE_CFG
>>>
>>> That would bring the guest into a pretty bad state or am I missing
>>> something essential here?
>>
>> What I did in Xen was to attempt to set it, then read it back and see. 
>> If LFENCE still isn't serialising, using repoline is the only available
>> mitigation.
>>
>> My understanding from the folk at AMD is that retpoline is safe to use,
>> but has higher overhead than the LFENCE approach.

Correct, the retpoline will work, it just takes more cycles.

>
> That still does not help vs. rdtsc_ordered() and LFENCE_RDTSC ...

Ok, I can add the read-back check before setting the feature flag(s).

But... what about the case where the guest is a different family than
hypervisor? If we're on, say, a Fam15h hypervisor but the guest is started
as a Fam0fh guest where the MSR doesn't exist and LFENCE is supposed to be
serialized? I'll have to do a rdmsr_safe() and only set the flag(s) if I
can successfully read the MSR back and validate the bit.

Thanks,
Tom

>
> Thanks,
>
> tglx
>

2018-01-08 15:03:00

by David Woodhouse

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC


> Ok, I can add the read-back check before setting the feature flag(s).
>
> But... what about the case where the guest is a different family than
> hypervisor? If we're on, say, a Fam15h hypervisor but the guest is started
> as a Fam0fh guest where the MSR doesn't exist and LFENCE is supposed to be
> serialized? I'll have to do a rdmsr_safe() and only set the flag(s) if I
> can successfully read the MSR back and validate the bit.

At some point, if the hypervisor is lying to you about what the CPU is,
and you can't check empirically, you have to admit defeat :)


--
dwmw2

2018-01-08 15:16:15

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On Mon, 8 Jan 2018, Tom Lendacky wrote:
> On 1/8/2018 5:10 AM, Thomas Gleixner wrote:
> >>> Second thoughts on that. As pointed out by someone in one of the insane
> >>> long threads:
> >>>
> >>> What happens if the kernel runs as a guest and
> >>>
> >>> - the hypervisor did not set the LFENCE to serializing on the host
> >>>
> >>> - the hypervisor does not allow writing MSR_AMD64_DE_CFG
> >>>
> >>> That would bring the guest into a pretty bad state or am I missing
> >>> something essential here?
> >>
> >> What I did in Xen was to attempt to set it, then read it back and see. 
> >> If LFENCE still isn't serialising, using repoline is the only available
> >> mitigation.
> >>
> >> My understanding from the folk at AMD is that retpoline is safe to use,
> >> but has higher overhead than the LFENCE approach.
>
> Correct, the retpoline will work, it just takes more cycles.
>
> >
> > That still does not help vs. rdtsc_ordered() and LFENCE_RDTSC ...
>
> Ok, I can add the read-back check before setting the feature flag(s).
>
> But... what about the case where the guest is a different family than
> hypervisor? If we're on, say, a Fam15h hypervisor but the guest is started
> as a Fam0fh guest where the MSR doesn't exist and LFENCE is supposed to be
> serialized? I'll have to do a rdmsr_safe() and only set the flag(s) if I
> can successfully read the MSR back and validate the bit.

But that still does not make this patch correct and neither the next one.

If you cannot set the flag and you cannot prove that you run on bare metal,
then you still need the MFENCE_RDTSC variant as you have no idea what the
underlying hypervisor is and how it has the LFENCE configured.

Thanks,

tglx

2018-01-08 16:33:31

by Andrew Cooper

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On 08/01/18 14:47, Tom Lendacky wrote:
> On 1/8/2018 5:10 AM, Thomas Gleixner wrote:
>> On Mon, 8 Jan 2018, Andrew Cooper wrote:
>>
>>> On 08/01/18 10:08, Thomas Gleixner wrote:
>>>> On Sat, 6 Jan 2018, tip-bot for Tom Lendacky wrote:
>>>>
>>>>> Commit-ID: 0bf17c102177d5da9363bf8b1e4704b9996d5079
>>>>> Gitweb: https://git.kernel.org/tip/0bf17c102177d5da9363bf8b1e4704b9996d5079
>>>>> Author: Tom Lendacky <[email protected]>
>>>>> AuthorDate: Fri, 5 Jan 2018 10:07:56 -0600
>>>>> Committer: Thomas Gleixner <[email protected]>
>>>>> CommitDate: Sat, 6 Jan 2018 21:57:40 +0100
>>>>>
>>>>> x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC
>>>>>
>>>>> With LFENCE now a serializing instruction, set the LFENCE_RDTSC
>>>>> feature since the LFENCE instruction has less overhead than the
>>>>> MFENCE instruction.
>>>> Second thoughts on that. As pointed out by someone in one of the insane
>>>> long threads:
>>>>
>>>> What happens if the kernel runs as a guest and
>>>>
>>>> - the hypervisor did not set the LFENCE to serializing on the host
>>>>
>>>> - the hypervisor does not allow writing MSR_AMD64_DE_CFG
>>>>
>>>> That would bring the guest into a pretty bad state or am I missing
>>>> something essential here?
>>> What I did in Xen was to attempt to set it, then read it back and see. 
>>> If LFENCE still isn't serialising, using repoline is the only available
>>> mitigation.
>>>
>>> My understanding from the folk at AMD is that retpoline is safe to use,
>>> but has higher overhead than the LFENCE approach.
> Correct, the retpoline will work, it just takes more cycles.
>
>> That still does not help vs. rdtsc_ordered() and LFENCE_RDTSC ...
> Ok, I can add the read-back check before setting the feature flag(s).
>
> But... what about the case where the guest is a different family than
> hypervisor? If we're on, say, a Fam15h hypervisor but the guest is started
> as a Fam0fh guest where the MSR doesn't exist and LFENCE is supposed to be
> serialized? I'll have to do a rdmsr_safe() and only set the flag(s) if I
> can successfully read the MSR back and validate the bit.

If your hypervisor is lying to you about the primary family, then all
bets are off.  I don't expect there will be any production systems doing
this.

The user can get to keep both pieces if they've decided that this was a
good thing to try.

~Andrew

2018-01-08 16:48:14

by Dr. David Alan Gilbert

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

* Andrew Cooper ([email protected]) wrote:
> On 08/01/18 14:47, Tom Lendacky wrote:
> > On 1/8/2018 5:10 AM, Thomas Gleixner wrote:
> >> On Mon, 8 Jan 2018, Andrew Cooper wrote:
> >>
> >>> On 08/01/18 10:08, Thomas Gleixner wrote:
> >>>> On Sat, 6 Jan 2018, tip-bot for Tom Lendacky wrote:
> >>>>
> >>>>> Commit-ID: 0bf17c102177d5da9363bf8b1e4704b9996d5079
> >>>>> Gitweb: https://git.kernel.org/tip/0bf17c102177d5da9363bf8b1e4704b9996d5079
> >>>>> Author: Tom Lendacky <[email protected]>
> >>>>> AuthorDate: Fri, 5 Jan 2018 10:07:56 -0600
> >>>>> Committer: Thomas Gleixner <[email protected]>
> >>>>> CommitDate: Sat, 6 Jan 2018 21:57:40 +0100
> >>>>>
> >>>>> x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC
> >>>>>
> >>>>> With LFENCE now a serializing instruction, set the LFENCE_RDTSC
> >>>>> feature since the LFENCE instruction has less overhead than the
> >>>>> MFENCE instruction.
> >>>> Second thoughts on that. As pointed out by someone in one of the insane
> >>>> long threads:
> >>>>
> >>>> What happens if the kernel runs as a guest and
> >>>>
> >>>> - the hypervisor did not set the LFENCE to serializing on the host
> >>>>
> >>>> - the hypervisor does not allow writing MSR_AMD64_DE_CFG
> >>>>
> >>>> That would bring the guest into a pretty bad state or am I missing
> >>>> something essential here?
> >>> What I did in Xen was to attempt to set it, then read it back and see.?
> >>> If LFENCE still isn't serialising, using repoline is the only available
> >>> mitigation.
> >>>
> >>> My understanding from the folk at AMD is that retpoline is safe to use,
> >>> but has higher overhead than the LFENCE approach.
> > Correct, the retpoline will work, it just takes more cycles.
> >
> >> That still does not help vs. rdtsc_ordered() and LFENCE_RDTSC ...
> > Ok, I can add the read-back check before setting the feature flag(s).
> >
> > But... what about the case where the guest is a different family than
> > hypervisor? If we're on, say, a Fam15h hypervisor but the guest is started
> > as a Fam0fh guest where the MSR doesn't exist and LFENCE is supposed to be
> > serialized? I'll have to do a rdmsr_safe() and only set the flag(s) if I
> > can successfully read the MSR back and validate the bit.
>
> If your hypervisor is lying to you about the primary family, then all
> bets are off.? I don't expect there will be any production systems doing
> this.

It's not that an unusual thing to do on qemu/kvm - to specify the lowest
common denominator of the set of CPUs in your data centre (for any one
vendor); it does tend to get some weird combinations.

Dave

> The user can get to keep both pieces if they've decided that this was a
> good thing to try.
>
> ~Andrew
--
Dr. David Alan Gilbert / [email protected] / Manchester, UK

2018-01-08 17:01:44

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On 08/01/2018 17:48, Dr. David Alan Gilbert wrote:
>> If your hypervisor is lying to you about the primary family, then all
>> bets are off.  I don't expect there will be any production systems doing
>> this.
> It's not that an unusual thing to do on qemu/kvm - to specify the lowest
> common denominator of the set of CPUs in your data centre (for any one
> vendor); it does tend to get some weird combinations.

Agreed. But on a hypervisor we pretty much know that:

- the MSR_AMD64_DE_CFG doesn't exist unless you have a fix

- setting the MSR_AMD64_DE_CFG bit to 1 if you have a fix can be done
independent of the family

So all KVM needs is a X86_FEATURE_LFENCE_SERIALIZE, it doesn't matter if
it's because of the family or because Linux has set MSR_F10H_DE_CFG.
The guest will either try setting the MSR bit and #GP, or it will find
it already set and do nothing.

Of course no code for this has been written yet.

Paolo

2018-01-08 17:31:51

by Tom Lendacky

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On 1/8/2018 9:15 AM, Thomas Gleixner wrote:
> On Mon, 8 Jan 2018, Tom Lendacky wrote:
>> On 1/8/2018 5:10 AM, Thomas Gleixner wrote:
>>>>> Second thoughts on that. As pointed out by someone in one of the insane
>>>>> long threads:
>>>>>
>>>>> What happens if the kernel runs as a guest and
>>>>>
>>>>> - the hypervisor did not set the LFENCE to serializing on the host
>>>>>
>>>>> - the hypervisor does not allow writing MSR_AMD64_DE_CFG
>>>>>
>>>>> That would bring the guest into a pretty bad state or am I missing
>>>>> something essential here?
>>>>
>>>> What I did in Xen was to attempt to set it, then read it back and see. 
>>>> If LFENCE still isn't serialising, using repoline is the only available
>>>> mitigation.
>>>>
>>>> My understanding from the folk at AMD is that retpoline is safe to use,
>>>> but has higher overhead than the LFENCE approach.
>>
>> Correct, the retpoline will work, it just takes more cycles.
>>
>>>
>>> That still does not help vs. rdtsc_ordered() and LFENCE_RDTSC ...
>>
>> Ok, I can add the read-back check before setting the feature flag(s).
>>
>> But... what about the case where the guest is a different family than
>> hypervisor? If we're on, say, a Fam15h hypervisor but the guest is started
>> as a Fam0fh guest where the MSR doesn't exist and LFENCE is supposed to be
>> serialized? I'll have to do a rdmsr_safe() and only set the flag(s) if I
>> can successfully read the MSR back and validate the bit.
>
> But that still does not make this patch correct and neither the next one.
>
> If you cannot set the flag and you cannot prove that you run on bare metal,
> then you still need the MFENCE_RDTSC variant as you have no idea what the
> underlying hypervisor is and how it has the LFENCE configured.

So now I'm also concerned about setting the retpoline method and using
LFENCE as the speculation barrier. If we go back to the original
statement:

- the hypervisor did not set the LFENCE to serializing on the host
- the hypervisor does not allow writing MSR_AMD64_DE_CFG

It looks like I'll need to attempt to write MSR_AMD64_DE_CFG and then read
it back checking for whether MSR_F10H_DECFG_LFENCE_SERIALIZE has been set.
If the MSR can be read and the bit is set, then I can set RETPOLINE_AMD
(once those patches go in) and LFENCE_RDTSC. Otherwise, set (only)
MFENCE_RDTSC. This also means we need to use MFENCE as the speculation
barrier instead of LFENCE in this situation (another alternative added to
the __nospec_barrier() implementation).

Does that sound right? Or does the "cannot prove that you run on bare
metal" mess this all up?

Thanks,
Tom

>
> Thanks,
>
> tglx
>

2018-01-08 17:39:39

by Tom Lendacky

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On 1/8/2018 11:01 AM, Paolo Bonzini wrote:
> On 08/01/2018 17:48, Dr. David Alan Gilbert wrote:
>>> If your hypervisor is lying to you about the primary family, then all
>>> bets are off.  I don't expect there will be any production systems doing
>>> this.
>> It's not that an unusual thing to do on qemu/kvm - to specify the lowest
>> common denominator of the set of CPUs in your data centre (for any one
>> vendor); it does tend to get some weird combinations.
>
> Agreed. But on a hypervisor we pretty much know that:
>
> - the MSR_AMD64_DE_CFG doesn't exist unless you have a fix

Not sure what you mean by this... the MSR exists today on many families.

Thanks,
Tom

>
> - setting the MSR_AMD64_DE_CFG bit to 1 if you have a fix can be done
> independent of the family
>
> So all KVM needs is a X86_FEATURE_LFENCE_SERIALIZE, it doesn't matter if
> it's because of the family or because Linux has set MSR_F10H_DE_CFG.
> The guest will either try setting the MSR bit and #GP, or it will find
> it already set and do nothing.
>
> Of course no code for this has been written yet.
>
> Paolo
>

2018-01-08 17:42:44

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On 08/01/2018 18:39, Tom Lendacky wrote:
> On 1/8/2018 11:01 AM, Paolo Bonzini wrote:
>> On 08/01/2018 17:48, Dr. David Alan Gilbert wrote:
>>>> If your hypervisor is lying to you about the primary family, then all
>>>> bets are off.  I don't expect there will be any production systems doing
>>>> this.
>>> It's not that an unusual thing to do on qemu/kvm - to specify the lowest
>>> common denominator of the set of CPUs in your data centre (for any one
>>> vendor); it does tend to get some weird combinations.
>>
>> Agreed. But on a hypervisor we pretty much know that:
>>
>> - the MSR_AMD64_DE_CFG doesn't exist unless you have a fix
>
> Not sure what you mean by this... the MSR exists today on many families.

But the hypervisor either should not expose it at all to the guest, or
if it does, it should not allow setting that bit to 1.

Paolo

> Thanks,
> Tom
>
>>
>> - setting the MSR_AMD64_DE_CFG bit to 1 if you have a fix can be done
>> independent of the family
>>
>> So all KVM needs is a X86_FEATURE_LFENCE_SERIALIZE, it doesn't matter if
>> it's because of the family or because Linux has set MSR_F10H_DE_CFG.
>> The guest will either try setting the MSR bit and #GP, or it will find
>> it already set and do nothing.
>>
>> Of course no code for this has been written yet.
>>
>> Paolo
>>

2018-01-08 20:57:54

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On Mon, 8 Jan 2018, Tom Lendacky wrote:
> So now I'm also concerned about setting the retpoline method and using
> LFENCE as the speculation barrier. If we go back to the original
> statement:
>
> - the hypervisor did not set the LFENCE to serializing on the host
> - the hypervisor does not allow writing MSR_AMD64_DE_CFG
>
> It looks like I'll need to attempt to write MSR_AMD64_DE_CFG and then read
> it back checking for whether MSR_F10H_DECFG_LFENCE_SERIALIZE has been set.
> If the MSR can be read and the bit is set, then I can set RETPOLINE_AMD
> (once those patches go in) and LFENCE_RDTSC. Otherwise, set (only)
> MFENCE_RDTSC. This also means we need to use MFENCE as the speculation
> barrier instead of LFENCE in this situation (another alternative added to
> the __nospec_barrier() implementation).
>
> Does that sound right? Or does the "cannot prove that you run on bare
> metal" mess this all up?

I think that covers it. If the hypervisor would be trapping the read and
then lying on the bit being set while it's not, that's really nothing you
should care about at all.

Thanks,

tglx

2018-01-17 17:22:07

by Tom Lendacky

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On 1/8/2018 11:01 AM, Paolo Bonzini wrote:
> On 08/01/2018 17:48, Dr. David Alan Gilbert wrote:
>>> If your hypervisor is lying to you about the primary family, then all
>>> bets are off.  I don't expect there will be any production systems doing
>>> this.
>> It's not that an unusual thing to do on qemu/kvm - to specify the lowest
>> common denominator of the set of CPUs in your data centre (for any one
>> vendor); it does tend to get some weird combinations.
>
> Agreed. But on a hypervisor we pretty much know that:
>
> - the MSR_AMD64_DE_CFG doesn't exist unless you have a fix
>
> - setting the MSR_AMD64_DE_CFG bit to 1 if you have a fix can be done
> independent of the family
>
> So all KVM needs is a X86_FEATURE_LFENCE_SERIALIZE, it doesn't matter if
> it's because of the family or because Linux has set MSR_F10H_DE_CFG.
> The guest will either try setting the MSR bit and #GP, or it will find
> it already set and do nothing.
>
> Of course no code for this has been written yet.
>

Hi Paolo,

What would be the best way to approach the MSR support? I was thinking of
just recognizing a write to that MSR but not actually doing anything and,
on read, just returning a value with the single bit set if LFENCE is
serializing and not worrying about the full contents of the MSR. Or I
could save the value so that it could also be host initiated and only
allow the LFENCE serialization bit to be set if the LFENCE_RDTSC feature
is enabled.

Thanks,
Tom

> Paolo
>

2018-01-17 17:55:01

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

On 17/01/2018 18:21, Tom Lendacky wrote:
> On 1/8/2018 11:01 AM, Paolo Bonzini wrote:
>> On 08/01/2018 17:48, Dr. David Alan Gilbert wrote:
>>>> If your hypervisor is lying to you about the primary family, then all
>>>> bets are off.  I don't expect there will be any production systems doing
>>>> this.
>>> It's not that an unusual thing to do on qemu/kvm - to specify the lowest
>>> common denominator of the set of CPUs in your data centre (for any one
>>> vendor); it does tend to get some weird combinations.
>>
>> Agreed. But on a hypervisor we pretty much know that:
>>
>> - the MSR_AMD64_DE_CFG doesn't exist unless you have a fix
>>
>> - setting the MSR_AMD64_DE_CFG bit to 1 if you have a fix can be done
>> independent of the family
>>
>> So all KVM needs is a X86_FEATURE_LFENCE_SERIALIZE, it doesn't matter if
>> it's because of the family or because Linux has set MSR_F10H_DE_CFG.
>> The guest will either try setting the MSR bit and #GP, or it will find
>> it already set and do nothing.
>>
>> Of course no code for this has been written yet.
>>
>
> Hi Paolo,
>
> What would be the best way to approach the MSR support? I was thinking of
> just recognizing a write to that MSR but not actually doing anything and,
> on read, just returning a value with the single bit set if LFENCE is
> serializing and not worrying about the full contents of the MSR. Or I
> could save the value so that it could also be host initiated and only
> allow the LFENCE serialization bit to be set if the LFENCE_RDTSC feature
> is enabled.

Yes, the latter is the correct one. We'll need changes in QEMU to add a
new feature bit in "-cpu" too. The "-cpu" feature bit, if set, causes
QEMU to set the bit in the MSR at CPU creation time. MSR-based features
are not yet a thing in QEMU, but we were planning to add them before
this whole kerfuffle started.

But indeed we need to return also whether the feature is supported on
the host, which would be similar to the first part (read-only, just
returning a value with the single bit set is LFENCE is serializing). We
would add KVM_GET_MSRS and KVM_GET_MSR_INDEX_LIST ioctls on the VM file
descriptor for that, and a new capability KVM_CAP_GET_HOST_MSR (or
KVM_CAP_GET_MSR_VM, you choose :)). QEMU can use these two ioctls to
query the available MSR-based CPU features. These can include microcode
version, VMX features, LFENCE serialization, IA32_ARCH_FACILITIES, etc.

Thanks,

Paolo