Subject: Re: [RFC v1 13/26] x86/tdx: Handle MWAIT, MONITOR and WBINVD

Hi Andy,

On 2/5/21 3:43 PM, Andy Lutomirski wrote:
> MWAIT turning into NOP is no good. How about suppressing
> X86_FEATURE_MWAIT instead?
Yes, we can suppress it in tdx_early_init().

+ setup_clear_cpu_cap(X86_FEATURE_MWAIT);

But do you want to leave the MWAIT #VE handler as it as
(just in case)?


--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


2021-02-06 03:25:12

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [RFC v1 13/26] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Fri, Feb 5, 2021 at 3:54 PM Kuppuswamy, Sathyanarayanan
<[email protected]> wrote:
>
> Hi Andy,
>
> On 2/5/21 3:43 PM, Andy Lutomirski wrote:
> > MWAIT turning into NOP is no good. How about suppressing
> > X86_FEATURE_MWAIT instead?
> Yes, we can suppress it in tdx_early_init().
>
> + setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>
> But do you want to leave the MWAIT #VE handler as it as
> (just in case)?
>

I would suggest decoding the error, printing a useful message, and
oopsing or at least warning.

>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer

Subject: [PATCH v1 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
are not supported. So handle #VE due to these instructions as no ops.

Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
---

Changes since previous series:
* Suppressed MWAIT feature as per Andi's comment.
* Added warning debug log for MWAIT #VE exception.

arch/x86/kernel/tdx.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index e936b2f88bf6..fb7d22b846fc 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -308,6 +308,9 @@ void __init tdx_early_init(void)

setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);

+ /* MWAIT is not supported in TDX platform, so suppress it */
+ setup_clear_cpu_cap(X86_FEATURE_MWAIT);
+
tdg_get_info();

pv_ops.irq.safe_halt = tdg_safe_halt;
@@ -362,6 +365,26 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
case EXIT_REASON_EPT_VIOLATION:
ve->instr_len = tdg_handle_mmio(regs, ve);
break;
+ /*
+ * Per Guest-Host-Communication Interface (GHCI) for Intel Trust
+ * Domain Extensions (Intel TDX) specification, sec 2.4,
+ * some instructions that unconditionally cause #VE (such as WBINVD,
+ * MONITOR, MWAIT) do not have corresponding TDCALL
+ * [TDG.VP.VMCALL <Instruction>] leaves, since the TD has been designed
+ * with no deterministic way to confirm the result of those operations
+ * performed by the host VMM. In those cases, the goal is for the TD
+ * #VE handler to increment the RIP appropriately based on the VE
+ * information provided via TDCALL.
+ */
+ case EXIT_REASON_WBINVD:
+ pr_warn_once("WBINVD #VE Exception\n");
+ case EXIT_REASON_MONITOR_INSTRUCTION:
+ /* Handle as nops. */
+ break;
+ case EXIT_REASON_MWAIT_INSTRUCTION:
+ /* MWAIT is supressed, not supposed to reach here. */
+ pr_warn("MWAIT unexpected #VE Exception\n");
+ return -EFAULT;
default:
pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
return -EFAULT;
--
2.25.1

2021-03-27 02:42:48

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD



> On Mar 26, 2021, at 5:18 PM, Kuppuswamy Sathyanarayanan <[email protected]> wrote:
>
> In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
> are not supported. So handle #VE due to these instructions as no ops.

These should at least be WARN.

Does TDX send #UD if these instructions have the wrong CPL? If the #VE came from user mode, we should send an appropriate signal instead.

>
> Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> Reviewed-by: Andi Kleen <[email protected]>
> ---
>
> Changes since previous series:
> * Suppressed MWAIT feature as per Andi's comment.
> * Added warning debug log for MWAIT #VE exception.
>
> arch/x86/kernel/tdx.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
> index e936b2f88bf6..fb7d22b846fc 100644
> --- a/arch/x86/kernel/tdx.c
> +++ b/arch/x86/kernel/tdx.c
> @@ -308,6 +308,9 @@ void __init tdx_early_init(void)
>
> setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
>
> + /* MWAIT is not supported in TDX platform, so suppress it */
> + setup_clear_cpu_cap(X86_FEATURE_MWAIT);
> +
> tdg_get_info();
>
> pv_ops.irq.safe_halt = tdg_safe_halt;
> @@ -362,6 +365,26 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
> case EXIT_REASON_EPT_VIOLATION:
> ve->instr_len = tdg_handle_mmio(regs, ve);
> break;
> + /*
> + * Per Guest-Host-Communication Interface (GHCI) for Intel Trust
> + * Domain Extensions (Intel TDX) specification, sec 2.4,
> + * some instructions that unconditionally cause #VE (such as WBINVD,
> + * MONITOR, MWAIT) do not have corresponding TDCALL
> + * [TDG.VP.VMCALL <Instruction>] leaves, since the TD has been designed
> + * with no deterministic way to confirm the result of those operations
> + * performed by the host VMM. In those cases, the goal is for the TD
> + * #VE handler to increment the RIP appropriately based on the VE
> + * information provided via TDCALL.
> + */
> + case EXIT_REASON_WBINVD:
> + pr_warn_once("WBINVD #VE Exception\n");
> + case EXIT_REASON_MONITOR_INSTRUCTION:
> + /* Handle as nops. */
> + break;
> + case EXIT_REASON_MWAIT_INSTRUCTION:
> + /* MWAIT is supressed, not supposed to reach here. */
> + pr_warn("MWAIT unexpected #VE Exception\n");
> + return -EFAULT;
> default:
> pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
> return -EFAULT;
> --
> 2.25.1
>

Subject: Re: [PATCH v1 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD



On 3/26/21 7:40 PM, Andy Lutomirski wrote:
>
>
>> On Mar 26, 2021, at 5:18 PM, Kuppuswamy Sathyanarayanan <[email protected]> wrote:
>>
>> In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
>> are not supported. So handle #VE due to these instructions as no ops.
>
> These should at least be WARN.
I will change it to WARN.
>
> Does TDX send #UD if these instructions have the wrong CPL?
No, TDX does not trigger #UD for these instructions.
If the #VE came from user mode, we should send an appropriate signal instead.
It will be mapped into #GP(0) fault. This should be enough notification right?
>
>>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
>> Reviewed-by: Andi Kleen <[email protected]>
>> ---
>>
>> Changes since previous series:
>> * Suppressed MWAIT feature as per Andi's comment.
>> * Added warning debug log for MWAIT #VE exception.
>>
>> arch/x86/kernel/tdx.c | 23 +++++++++++++++++++++++
>> 1 file changed, 23 insertions(+)
>>
>> diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
>> index e936b2f88bf6..fb7d22b846fc 100644
>> --- a/arch/x86/kernel/tdx.c
>> +++ b/arch/x86/kernel/tdx.c
>> @@ -308,6 +308,9 @@ void __init tdx_early_init(void)
>>
>> setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
>>
>> + /* MWAIT is not supported in TDX platform, so suppress it */
>> + setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>> +
>> tdg_get_info();
>>
>> pv_ops.irq.safe_halt = tdg_safe_halt;
>> @@ -362,6 +365,26 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
>> case EXIT_REASON_EPT_VIOLATION:
>> ve->instr_len = tdg_handle_mmio(regs, ve);
>> break;
>> + /*
>> + * Per Guest-Host-Communication Interface (GHCI) for Intel Trust
>> + * Domain Extensions (Intel TDX) specification, sec 2.4,
>> + * some instructions that unconditionally cause #VE (such as WBINVD,
>> + * MONITOR, MWAIT) do not have corresponding TDCALL
>> + * [TDG.VP.VMCALL <Instruction>] leaves, since the TD has been designed
>> + * with no deterministic way to confirm the result of those operations
>> + * performed by the host VMM. In those cases, the goal is for the TD
>> + * #VE handler to increment the RIP appropriately based on the VE
>> + * information provided via TDCALL.
>> + */
>> + case EXIT_REASON_WBINVD:
>> + pr_warn_once("WBINVD #VE Exception\n");
>> + case EXIT_REASON_MONITOR_INSTRUCTION:
>> + /* Handle as nops. */
>> + break;
>> + case EXIT_REASON_MWAIT_INSTRUCTION:
>> + /* MWAIT is supressed, not supposed to reach here. */
>> + pr_warn("MWAIT unexpected #VE Exception\n");
>> + return -EFAULT;
>> default:
>> pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
>> return -EFAULT;
>> --
>> 2.25.1
>>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2021-03-27 16:05:15

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD




> On Mar 26, 2021, at 8:40 PM, Kuppuswamy, Sathyanarayanan <[email protected]> wrote:
>
> 
>
> On 3/26/21 7:40 PM, Andy Lutomirski wrote:
>>>> On Mar 26, 2021, at 5:18 PM, Kuppuswamy Sathyanarayanan <[email protected]> wrote:
>>>
>>> In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
>>> are not supported. So handle #VE due to these instructions as no ops.
>> These should at least be WARN.
> I will change it to WARN.
>> Does TDX send #UD if these instructions have the wrong CPL?
> No, TDX does not trigger #UD for these instructions.
> If the #VE came from user mode, we should send an appropriate signal instead.
> It will be mapped into #GP(0) fault. This should be enough notification right?

Yes. And I did mean #GP, not #UD.

Subject: [PATCH v2 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
are not supported. So handle #VE due to these instructions as no ops.

Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
---

Changes since v1:
* Added WARN() for MWAIT #VE exception.

Changes since previous series:
* Suppressed MWAIT feature as per Andi's comment.
* Added warning debug log for MWAIT #VE exception.

arch/x86/kernel/tdx.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index e936b2f88bf6..3c6158a53c17 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -308,6 +308,9 @@ void __init tdx_early_init(void)

setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);

+ /* MWAIT is not supported in TDX platform, so suppress it */
+ setup_clear_cpu_cap(X86_FEATURE_MWAIT);
+
tdg_get_info();

pv_ops.irq.safe_halt = tdg_safe_halt;
@@ -362,6 +365,26 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
case EXIT_REASON_EPT_VIOLATION:
ve->instr_len = tdg_handle_mmio(regs, ve);
break;
+ /*
+ * Per Guest-Host-Communication Interface (GHCI) for Intel Trust
+ * Domain Extensions (Intel TDX) specification, sec 2.4,
+ * some instructions that unconditionally cause #VE (such as WBINVD,
+ * MONITOR, MWAIT) do not have corresponding TDCALL
+ * [TDG.VP.VMCALL <Instruction>] leaves, since the TD has been designed
+ * with no deterministic way to confirm the result of those operations
+ * performed by the host VMM. In those cases, the goal is for the TD
+ * #VE handler to increment the RIP appropriately based on the VE
+ * information provided via TDCALL.
+ */
+ case EXIT_REASON_WBINVD:
+ pr_warn_once("WBINVD #VE Exception\n");
+ case EXIT_REASON_MONITOR_INSTRUCTION:
+ /* Handle as nops. */
+ break;
+ case EXIT_REASON_MWAIT_INSTRUCTION:
+ /* MWAIT is supressed, not supposed to reach here. */
+ WARN(1, "MWAIT unexpected #VE Exception\n");
+ return -EFAULT;
default:
pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
return -EFAULT;
--
2.25.1

2021-03-29 17:16:12

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/27/21 3:54 PM, Kuppuswamy Sathyanarayanan wrote:
> + /*
> + * Per Guest-Host-Communication Interface (GHCI) for Intel Trust
> + * Domain Extensions (Intel TDX) specification, sec 2.4,
> + * some instructions that unconditionally cause #VE (such as WBINVD,
> + * MONITOR, MWAIT) do not have corresponding TDCALL
> + * [TDG.VP.VMCALL <Instruction>] leaves, since the TD has been designed
> + * with no deterministic way to confirm the result of those operations
> + * performed by the host VMM. In those cases, the goal is for the TD
> + * #VE handler to increment the RIP appropriately based on the VE
> + * information provided via TDCALL.
> + */

That's an awfully big comment. Could you pare it down, please? Maybe
focus on the fact that we should never get here and why, rather than
talking about some silly spec?

> + case EXIT_REASON_WBINVD:
> + pr_warn_once("WBINVD #VE Exception\n");

I actually think WBINVD in here should oops. We use it for some really
important things. If it can't be executed, and we're depending on it,
the kernel is in deep, deep trouble.

I think a noop here is dangerous.

> + case EXIT_REASON_MONITOR_INSTRUCTION:
> + /* Handle as nops. */
> + break;

MONITOR is a privileged instruction, right? So we can only end up in
here if the kernel screws up and isn't reading CPUID correctly, right?

That dosen't seem to me like something we want to suppress. This needs
a warning, at least. I assume that having a MONITOR instruction
immediately return doesn't do any harm.

> + case EXIT_REASON_MWAIT_INSTRUCTION:
> + /* MWAIT is supressed, not supposed to reach here. */
> + WARN(1, "MWAIT unexpected #VE Exception\n");
> + return -EFAULT;

How is MWAIT "supppressed"?

Subject: Re: [PATCH v2 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD



On 3/29/21 10:14 AM, Dave Hansen wrote:
> On 3/27/21 3:54 PM, Kuppuswamy Sathyanarayanan wrote:
>> + /*
>> + * Per Guest-Host-Communication Interface (GHCI) for Intel Trust
>> + * Domain Extensions (Intel TDX) specification, sec 2.4,
>> + * some instructions that unconditionally cause #VE (such as WBINVD,
>> + * MONITOR, MWAIT) do not have corresponding TDCALL
>> + * [TDG.VP.VMCALL <Instruction>] leaves, since the TD has been designed
>> + * with no deterministic way to confirm the result of those operations
>> + * performed by the host VMM. In those cases, the goal is for the TD
>> + * #VE handler to increment the RIP appropriately based on the VE
>> + * information provided via TDCALL.
>> + */

>
> That's an awfully big comment. Could you pare it down, please? Maybe
> focus on the fact that we should never get here and why, rather than
> talking about some silly spec?
I will remove this and add individual one line comment for WBINVD and MONITOR
instructions. Some thing like "Privileged instruction, can only be executed
in ring 0. So raise a BUG.
>
>> + case EXIT_REASON_WBINVD:
>> + pr_warn_once("WBINVD #VE Exception\n");
>
> I actually think WBINVD in here should oops. We use it for some really
> important things. If it can't be executed, and we're depending on it,
> the kernel is in deep, deep trouble.
Agree. I will call BUG().
>
> I think a noop here is dangerous.
>
>> + case EXIT_REASON_MONITOR_INSTRUCTION:
>> + /* Handle as nops. */
>> + break;
>
> MONITOR is a privileged instruction, right? So we can only end up in
> here if the kernel screws up and isn't reading CPUID correctly, right?
>
> That dosen't seem to me like something we want to suppress. This needs
> a warning, at least. I assume that having a MONITOR instruction
> immediately return doesn't do any harm.
Agree. Since we are not supposed to come here, I will use BUG.
>
>> + case EXIT_REASON_MWAIT_INSTRUCTION:
>> + /* MWAIT is supressed, not supposed to reach here. */
>> + WARN(1, "MWAIT unexpected #VE Exception\n");
>> + return -EFAULT;
>
> How is MWAIT "supppressed"?
I am clearing the MWAIT feature flag in early init code. We should also disable
this feature in firmware.
setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2021-03-29 22:04:16

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/29/21 2:55 PM, Kuppuswamy, Sathyanarayanan wrote:
>>
>> MONITOR is a privileged instruction, right?  So we can only end up in
>> here if the kernel screws up and isn't reading CPUID correctly, right?
>>
>> That dosen't seem to me like something we want to suppress.  This needs
>> a warning, at least.  I assume that having a MONITOR instruction
>> immediately return doesn't do any harm.
> Agree. Since we are not supposed to come here, I will use BUG.

"This is unexpected" is a WARN()able offense.

"This is unexpected and might be corrupting data" is where we want to
use BUG().

Does broken MONITOR risk data corruption?

>>> +    case EXIT_REASON_MWAIT_INSTRUCTION:
>>> +        /* MWAIT is supressed, not supposed to reach here. */
>>> +        WARN(1, "MWAIT unexpected #VE Exception\n");
>>> +        return -EFAULT;
>>
>> How is MWAIT "supppressed"?
> I am clearing the MWAIT feature flag in early init code. We should also
> disable this feature in firmware. setup_clear_cpu_cap(X86_FEATURE_MWAIT);

I'd be more explicit about that. Maybe even reference the code that
clears the X86_FEATURE.

Subject: Re: [PATCH v2 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD



On 3/29/21 3:02 PM, Dave Hansen wrote:
> On 3/29/21 2:55 PM, Kuppuswamy, Sathyanarayanan wrote:
>>>
>>> MONITOR is a privileged instruction, right?  So we can only end up in
>>> here if the kernel screws up and isn't reading CPUID correctly, right?
>>>
>>> That dosen't seem to me like something we want to suppress.  This needs
>>> a warning, at least.  I assume that having a MONITOR instruction
>>> immediately return doesn't do any harm.
>> Agree. Since we are not supposed to come here, I will use BUG.
>
> "This is unexpected" is a WARN()able offense.
>
> "This is unexpected and might be corrupting data" is where we want to
> use BUG().
>
> Does broken MONITOR risk data corruption?
We will be reaching this point only if something is buggy in kernel. I am
not sure about impact of this buggy state. But MONITOR instruction by
itself, should not cause data corruption.

>
>>>> +    case EXIT_REASON_MWAIT_INSTRUCTION:
>>>> +        /* MWAIT is supressed, not supposed to reach here. */
>>>> +        WARN(1, "MWAIT unexpected #VE Exception\n");
>>>> +        return -EFAULT;
>>>
>>> How is MWAIT "supppressed"?
>> I am clearing the MWAIT feature flag in early init code. We should also
>> disable this feature in firmware. setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>
> I'd be more explicit about that. Maybe even reference the code that
> clears the X86_FEATURE.
This change is part of the same patch.
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2021-03-29 22:14:58

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/29/21 3:09 PM, Kuppuswamy, Sathyanarayanan wrote:
>>>>> +    case EXIT_REASON_MWAIT_INSTRUCTION:
>>>>> +        /* MWAIT is supressed, not supposed to reach here. */
>>>>> +        WARN(1, "MWAIT unexpected #VE Exception\n");
>>>>> +        return -EFAULT;
>>>>
>>>> How is MWAIT "supppressed"?
>>> I am clearing the MWAIT feature flag in early init code. We should also
>>> disable this feature in firmware.
>>> setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>>
>> I'd be more explicit about that.  Maybe even reference the code that
>> clears the X86_FEATURE.
> This change is part of the same patch.

Right, but if someone goes and looks at the switch() statement in 10
years is it going to be obvious how MWAIT was "suppressed"?

Subject: Re: [PATCH v2 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD



On 3/29/21 3:12 PM, Dave Hansen wrote:
> On 3/29/21 3:09 PM, Kuppuswamy, Sathyanarayanan wrote:
>>>>>> +    case EXIT_REASON_MWAIT_INSTRUCTION:
>>>>>> +        /* MWAIT is supressed, not supposed to reach here. */
>>>>>> +        WARN(1, "MWAIT unexpected #VE Exception\n");
>>>>>> +        return -EFAULT;
>>>>>
>>>>> How is MWAIT "supppressed"?
>>>> I am clearing the MWAIT feature flag in early init code. We should also
>>>> disable this feature in firmware.
>>>> setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>>>
>>> I'd be more explicit about that.  Maybe even reference the code that
>>> clears the X86_FEATURE.
>> This change is part of the same patch.
>
> Right, but if someone goes and looks at the switch() statement in 10
> years is it going to be obvious how MWAIT was "suppressed"?
Ok. I can add a comment line for it.
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

Subject: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
are not supported. So handle #VE due to these instructions
appropriately.

Since the impact of executing WBINVD instruction in non ring-0 mode
can be heavy, use BUG() to report it. For others, raise a WARNING
message.

Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
---

Changes since v2:
* Added BUG() for WBINVD, WARN for MONITOR instructions.
* Fixed comments as per Dave's review.

Changes since v1:
* Added WARN() for MWAIT #VE exception.

Changes since previous series:
* Suppressed MWAIT feature as per Andi's comment.
* Added warning debug log for MWAIT #VE exception.

arch/x86/kernel/tdx.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index e936b2f88bf6..4c6336a055a3 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -308,6 +308,9 @@ void __init tdx_early_init(void)

setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);

+ /* MWAIT is not supported in TDX platform, so suppress it */
+ setup_clear_cpu_cap(X86_FEATURE_MWAIT);
+
tdg_get_info();

pv_ops.irq.safe_halt = tdg_safe_halt;
@@ -362,6 +365,32 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
case EXIT_REASON_EPT_VIOLATION:
ve->instr_len = tdg_handle_mmio(regs, ve);
break;
+ case EXIT_REASON_WBINVD:
+ /*
+ * WBINVD is a privileged instruction, can only be executed
+ * in ring 0. Since we reached here, the kernel is in buggy
+ * state.
+ */
+ pr_err("WBINVD #VE Exception\n");
+ BUG();
+ break;
+ case EXIT_REASON_MONITOR_INSTRUCTION:
+ /*
+ * MONITOR is a privileged instruction, can only be executed
+ * in ring 0. So we are not supposed to reach here. Raise a
+ * warning message.
+ */
+ WARN(1, "MONITOR unexpected #VE Exception\n");
+ break;
+ case EXIT_REASON_MWAIT_INSTRUCTION:
+ /*
+ * MWAIT feature is suppressed in firmware and in
+ * tdx_early_init() by clearing X86_FEATURE_MWAIT CPU feature
+ * flag. Since we are not supposed to reach here, raise a
+ * warning message and return -EFAULT.
+ */
+ WARN(1, "MWAIT unexpected #VE Exception\n");
+ return -EFAULT;
default:
pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
return -EFAULT;
--
2.25.1

2021-03-29 23:26:03

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD


> On Mar 29, 2021, at 4:17 PM, Kuppuswamy Sathyanarayanan <[email protected]> wrote:
>
> In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
> are not supported. So handle #VE due to these instructions
> appropriately.

Is there something I missed elsewhere in the code that checks CPL?

Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD



On 3/29/21 4:23 PM, Andy Lutomirski wrote:
>
>> On Mar 29, 2021, at 4:17 PM, Kuppuswamy Sathyanarayanan <[email protected]> wrote:
>>
>> In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
>> are not supported. So handle #VE due to these instructions
>> appropriately.
>
> Is there something I missed elsewhere in the code that checks CPL?
We don't check for CPL explicitly. But if we are reaching here, then we
executing these instructions with wrong CPL.
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2021-03-29 23:41:19

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Mon, Mar 29, 2021, Andy Lutomirski wrote:
>
> > On Mar 29, 2021, at 4:17 PM, Kuppuswamy Sathyanarayanan <[email protected]> wrote:
> >
> > In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
> > are not supported. So handle #VE due to these instructions
> > appropriately.
>
> Is there something I missed elsewhere in the code that checks CPL?

#GP due to CPL!=0 has priority over VM-Exit, i.e. userspace will get a #GP
directly; there will be no VM-Exit to the TDX Module and thus no #VE.

SDM section "25.1.1 - Relative Priority of Faults and VM Exits".

2021-03-29 23:41:50

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/29/21 4:16 PM, Kuppuswamy Sathyanarayanan wrote:
> In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
> are not supported. So handle #VE due to these instructions
> appropriately.

This misses a key detail:

"are not supported" ... and other patches have prevented a guest
from using these instructions.

In other words, they're bad now, and we know it. We tried to keep the
kernel from using them, but we failed. Whoops.

> Since the impact of executing WBINVD instruction in non ring-0 mode
> can be heavy, use BUG() to report it. For others, raise a WARNING
> message.

"Heavy" makes it sound like there's a performance impact.



> pv_ops.irq.safe_halt = tdg_safe_halt;
> @@ -362,6 +365,32 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
> case EXIT_REASON_EPT_VIOLATION:
> ve->instr_len = tdg_handle_mmio(regs, ve);
> break;
> + case EXIT_REASON_WBINVD:
> + /*
> + * WBINVD is a privileged instruction, can only be executed
> + * in ring 0. Since we reached here, the kernel is in buggy
> + * state.
> + */
> + pr_err("WBINVD #VE Exception\n");

"#VE Exception" is basically wasted bytes. It really tells us nothing.
This, on the other hand:

"TDX guest used unsupported WBINVD instruction"

gives us the clues that TDX is involved and that the kernel used the
instruction. The fact that #VE itself is involved is kinda irrelevant.

I also hate the comment. Don't waste the bytes saying we're in a buggy
state. That's kinda obvious from the BUG().

Again, it might be nice to mention in the changelog *WHY* we're so sure
that WBINVD won't be called from a TDX guest. What did we do to
guarantee that? How could we be sure that someone looking at the splat
that this generates and then reading the comment can go fix the bug in
their kernel?

> + BUG();
> + break;
> + case EXIT_REASON_MONITOR_INSTRUCTION:
> + /*
> + * MONITOR is a privileged instruction, can only be executed
> + * in ring 0. So we are not supposed to reach here. Raise a
> + * warning message.
> + */
> + WARN(1, "MONITOR unexpected #VE Exception\n");
> + break;
> + case EXIT_REASON_MWAIT_INSTRUCTION:
> + /*
> + * MWAIT feature is suppressed in firmware and in
> + * tdx_early_init() by clearing X86_FEATURE_MWAIT CPU feature
> + * flag. Since we are not supposed to reach here, raise a
> + * warning message and return -EFAULT.
> + */
> + WARN(1, "MWAIT unexpected #VE Exception\n");
> + return -EFAULT;
> default:
> pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
> return -EFAULT;

This is more of the same. Don't waste comment bytes telling me we're
not suppose to reach a BUG() or WARN().

2021-03-29 23:46:43

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Mon, Mar 29, 2021, Kuppuswamy, Sathyanarayanan wrote:
>
>
> On 3/29/21 4:23 PM, Andy Lutomirski wrote:
> >
> > > On Mar 29, 2021, at 4:17 PM, Kuppuswamy Sathyanarayanan <[email protected]> wrote:
> > >
> > > In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
> > > are not supported. So handle #VE due to these instructions
> > > appropriately.
> >
> > Is there something I missed elsewhere in the code that checks CPL?
> We don't check for CPL explicitly. But if we are reaching here, then we
> executing these instructions with wrong CPL.

No, if these instructions take a #VE then they were executed at CPL=0. MONITOR
and MWAIT will #UD without VM-Exit->#VE. Same for WBINVD, s/#UD/#GP.

2021-03-30 00:01:58

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Mon, Mar 29, 2021 at 4:42 PM Sean Christopherson <[email protected]> wrote:
>
> On Mon, Mar 29, 2021, Kuppuswamy, Sathyanarayanan wrote:
> >
> >
> > On 3/29/21 4:23 PM, Andy Lutomirski wrote:
> > >
> > > > On Mar 29, 2021, at 4:17 PM, Kuppuswamy Sathyanarayanan <[email protected]> wrote:
> > > >
> > > > In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
> > > > are not supported. So handle #VE due to these instructions
> > > > appropriately.
> > >
> > > Is there something I missed elsewhere in the code that checks CPL?
> > We don't check for CPL explicitly. But if we are reaching here, then we
> > executing these instructions with wrong CPL.
>
> No, if these instructions take a #VE then they were executed at CPL=0. MONITOR
> and MWAIT will #UD without VM-Exit->#VE. Same for WBINVD, s/#UD/#GP.

Dare I ask about XSETBV?

2021-03-30 02:07:28

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

> > No, if these instructions take a #VE then they were executed at CPL=0. MONITOR
> > and MWAIT will #UD without VM-Exit->#VE. Same for WBINVD, s/#UD/#GP.
>
> Dare I ask about XSETBV?

XGETBV does not cause a #VE, it just works normally. The guest has full
AVX capabilities.

-Andi

2021-03-30 03:02:11

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD


> On Mar 29, 2021, at 7:04 PM, Andi Kleen <[email protected]> wrote:
>
> 
>>
>>> No, if these instructions take a #VE then they were executed at CPL=0. MONITOR
>>> and MWAIT will #UD without VM-Exit->#VE. Same for WBINVD, s/#UD/#GP.
>>
>> Dare I ask about XSETBV?
>
> XGETBV does not cause a #VE, it just works normally. The guest has full
> AVX capabilities.
>

X *SET* BV

2021-03-30 05:00:17

by Xiaoyao Li

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/27/2021 8:18 AM, Kuppuswamy Sathyanarayanan wrote:
> In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
> are not supported. So handle #VE due to these instructions as no ops.
>
> Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> Reviewed-by: Andi Kleen <[email protected]>
> ---
>
> Changes since previous series:
> * Suppressed MWAIT feature as per Andi's comment.
> * Added warning debug log for MWAIT #VE exception.
>
> arch/x86/kernel/tdx.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
> index e936b2f88bf6..fb7d22b846fc 100644
> --- a/arch/x86/kernel/tdx.c
> +++ b/arch/x86/kernel/tdx.c
> @@ -308,6 +308,9 @@ void __init tdx_early_init(void)
>
> setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
>
> + /* MWAIT is not supported in TDX platform, so suppress it */
> + setup_clear_cpu_cap(X86_FEATURE_MWAIT);

In fact, MWAIT bit returned by CPUID instruction is zero for TD guest.
This is enforced by SEAM module.

Do we still need to safeguard it by setup_clear_cpu_cap() here?

> +
> tdg_get_info();
>
> pv_ops.irq.safe_halt = tdg_safe_halt;


2021-03-30 15:03:02

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Tue, Mar 30, 2021 at 12:56:41PM +0800, Xiaoyao Li wrote:
> On 3/27/2021 8:18 AM, Kuppuswamy Sathyanarayanan wrote:
> > In non-root TDX guest mode, MWAIT, MONITOR and WBINVD instructions
> > are not supported. So handle #VE due to these instructions as no ops.
> >
> > Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> > Reviewed-by: Andi Kleen <[email protected]>
> > ---
> >
> > Changes since previous series:
> > * Suppressed MWAIT feature as per Andi's comment.
> > * Added warning debug log for MWAIT #VE exception.
> >
> > arch/x86/kernel/tdx.c | 23 +++++++++++++++++++++++
> > 1 file changed, 23 insertions(+)
> >
> > diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
> > index e936b2f88bf6..fb7d22b846fc 100644
> > --- a/arch/x86/kernel/tdx.c
> > +++ b/arch/x86/kernel/tdx.c
> > @@ -308,6 +308,9 @@ void __init tdx_early_init(void)
> > setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
> > + /* MWAIT is not supported in TDX platform, so suppress it */
> > + setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>
> In fact, MWAIT bit returned by CPUID instruction is zero for TD guest. This
> is enforced by SEAM module.

Good point.
>
> Do we still need to safeguard it by setup_clear_cpu_cap() here?

I guess it doesn't hurt to do it explicitly.


-Andi

2021-03-30 15:12:08

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v1 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/30/21 8:00 AM, Andi Kleen wrote:
>>> + /* MWAIT is not supported in TDX platform, so suppress it */
>>> + setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>> In fact, MWAIT bit returned by CPUID instruction is zero for TD guest. This
>> is enforced by SEAM module.
> Good point.
>> Do we still need to safeguard it by setup_clear_cpu_cap() here?
> I guess it doesn't hurt to do it explicitly.

If this MWAIT behavior (clearing the CPUID bit) is part of the guest
architecture, then it would also be appropriate to WARN() rather than
silently clearing the X86_FEATURE bit.

2021-03-30 15:18:12

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Mon, Mar 29, 2021, Andy Lutomirski wrote:
>
> > On Mar 29, 2021, at 7:04 PM, Andi Kleen <[email protected]> wrote:
> >
> > 
> >>
> >>> No, if these instructions take a #VE then they were executed at CPL=0. MONITOR
> >>> and MWAIT will #UD without VM-Exit->#VE. Same for WBINVD, s/#UD/#GP.
> >>
> >> Dare I ask about XSETBV?
> >
> > XGETBV does not cause a #VE, it just works normally. The guest has full
> > AVX capabilities.
> >
>
> X *SET* BV

Heh, XSETBV also works normally, relative to the features enumerated in CPUID.
XSAVES/XRSTORS support is fixed to '1' in the virtual CPU model. A subset of
the features managed by XSAVE can be hidden by the VMM, but attempting to enable
unsupported features will #GP (either from hardware or injected by TDX Module),
not #VE.

2021-03-30 16:39:35

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD


> On Mar 30, 2021, at 8:14 AM, Sean Christopherson <[email protected]> wrote:
>
> On Mon, Mar 29, 2021, Andy Lutomirski wrote:
>>
>>>> On Mar 29, 2021, at 7:04 PM, Andi Kleen <[email protected]> wrote:
>>>
>>> 
>>>>
>>>>> No, if these instructions take a #VE then they were executed at CPL=0. MONITOR
>>>>> and MWAIT will #UD without VM-Exit->#VE. Same for WBINVD, s/#UD/#GP.
>>>>
>>>> Dare I ask about XSETBV?
>>>
>>> XGETBV does not cause a #VE, it just works normally. The guest has full
>>> AVX capabilities.
>>>
>>
>> X *SET* BV
>
> Heh, XSETBV also works normally, relative to the features enumerated in CPUID.
> XSAVES/XRSTORS support is fixed to '1' in the virtual CPU model. A subset of
> the features managed by XSAVE can be hidden by the VMM, but attempting to enable
> unsupported features will #GP (either from hardware or injected by TDX Module),
> not #VE.

Normally in non-root mode means that every XSETBV results in a VM exit and, IIUC, there’s a buglet in that this happens even if CPL==3. Does something special happen in TDX or does the exit get reflected back to the guest as a #VE?

2021-03-30 16:59:32

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Tue, Mar 30, 2021, Andy Lutomirski wrote:
>
> > On Mar 30, 2021, at 8:14 AM, Sean Christopherson <[email protected]> wrote:
> >
> > On Mon, Mar 29, 2021, Andy Lutomirski wrote:
> >>
> >>>> On Mar 29, 2021, at 7:04 PM, Andi Kleen <[email protected]> wrote:
> >>>
> >>> 
> >>>>
> >>>>> No, if these instructions take a #VE then they were executed at CPL=0. MONITOR
> >>>>> and MWAIT will #UD without VM-Exit->#VE. Same for WBINVD, s/#UD/#GP.
> >>>>
> >>>> Dare I ask about XSETBV?
> >>>
> >>> XGETBV does not cause a #VE, it just works normally. The guest has full
> >>> AVX capabilities.
> >>>
> >>
> >> X *SET* BV
> >
> > Heh, XSETBV also works normally, relative to the features enumerated in CPUID.
> > XSAVES/XRSTORS support is fixed to '1' in the virtual CPU model. A subset of
> > the features managed by XSAVE can be hidden by the VMM, but attempting to enable
> > unsupported features will #GP (either from hardware or injected by TDX Module),
> > not #VE.
>
> Normally in non-root mode means that every XSETBV results in a VM exit and,
> IIUC, there’s a buglet in that this happens even if CPL==3. Does something
> special happen in TDX or does the exit get reflected back to the guest as a
> #VE?

Hmm, I forgot about that quirk. I would expect the TDX Module to inject a #GP
for that case. I can't find anything in the spec that confirms or denies that,
but injecting #VE would be weird and pointless.

Andi/Sathya, the TDX Module spec should be updated to state that XSETBV will
#GP at CPL!=0. If that's not already the behavior, the module should probably
be changed...

Subject: Re: [PATCH v1 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD



On 3/30/21 8:10 AM, Dave Hansen wrote:
> On 3/30/21 8:00 AM, Andi Kleen wrote:
>>>> + /* MWAIT is not supported in TDX platform, so suppress it */
>>>> + setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>>> In fact, MWAIT bit returned by CPUID instruction is zero for TD guest. This
>>> is enforced by SEAM module.
>> Good point.
>>> Do we still need to safeguard it by setup_clear_cpu_cap() here?
>> I guess it doesn't hurt to do it explicitly.
>
> If this MWAIT behavior (clearing the CPUID bit) is part of the guest
> architecture, then it would also be appropriate to WARN() rather than
> silently clearing the X86_FEATURE bit.
Makes sense. It will be useful to find the problem with TDX Module.
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

Subject: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

As per Guest-Host Communication Interface (GHCI) Specification
for Intel TDX, sec 2.4.1, TDX architecture does not support
MWAIT, MONITOR and WBINVD instructions. So in non-root TDX mode,
if MWAIT/MONITOR instructions are executed with CPL != 0 it will
trigger #UD, and for CPL = 0 case, virtual exception (#VE) is
triggered. WBINVD instruction behavior is also similar to
MWAIT/MONITOR, but for CPL != 0 case, it will trigger #GP instead
of #UD.

To prevent TD guest from using these unsupported instructions,
following measures are adapted:

1. For MWAIT/MONITOR instructions, support for these instructions
are already disabled by TDX module (SEAM). So CPUID flags for
these instructions should be in disabled state. Also, just to be
sure that these instructions are disabled, forcefully unset
X86_FEATURE_MWAIT CPU cap in OS.

2. For WBINVD instruction, we use audit to find the code that uses
this instruction and disable them for TD.

After the above mentioned preventive measures, if TD guest still
execute these instructions, add appropriate warning messages in #VE
handler.

Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
---

Changes since v3:
* WARN user if SEAM does not disable MONITOR/MWAIT instruction.
* Fix the commit log and comments to address review comments from
from Dave & Sean.

Changes since v2:
* Added BUG() for WBINVD, WARN for MONITOR instructions.
* Fixed comments as per Dave's review.

Changes since v1:
* Added WARN() for MWAIT #VE exception.

Changes since previous series:
* Suppressed MWAIT feature as per Andi's comment.
* Added warning debug log for MWAIT #VE exception.

arch/x86/kernel/tdx.c | 44 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index e936b2f88bf6..82b411b828a5 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -63,6 +63,14 @@ static inline bool cpuid_has_tdx_guest(void)
return true;
}

+static inline bool cpuid_has_mwait(void)
+{
+ if (cpuid_ecx(1) & (1 << (X86_FEATURE_MWAIT % 32)))
+ return true;
+
+ return false;
+}
+
bool is_tdx_guest(void)
{
return static_cpu_has(X86_FEATURE_TDX_GUEST);
@@ -301,12 +309,25 @@ static int tdg_handle_mmio(struct pt_regs *regs, struct ve_info *ve)
return insn.length;
}

+/* Initialize TDX specific CPU capabilities */
+static void __init tdx_cpu_cap_init(void)
+{
+ setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
+
+ if (cpuid_has_mwait()) {
+ WARN(1, "TDX Module failed to disable MWAIT\n");
+ /* MWAIT is not supported in TDX platform, so suppress it */
+ setup_clear_cpu_cap(X86_FEATURE_MWAIT);
+ }
+
+}
+
void __init tdx_early_init(void)
{
if (!cpuid_has_tdx_guest())
return;

- setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
+ tdx_cpu_cap_init();

tdg_get_info();

@@ -362,6 +383,27 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
case EXIT_REASON_EPT_VIOLATION:
ve->instr_len = tdg_handle_mmio(regs, ve);
break;
+ case EXIT_REASON_WBINVD:
+ /*
+ * TDX architecture does not support WBINVD instruction.
+ * Currently, usage of this instruction is prevented by
+ * disabling the drivers which uses it. So if we still
+ * reach here, it needs user attention.
+ */
+ pr_err("TD Guest used unsupported WBINVD instruction\n");
+ BUG();
+ break;
+ case EXIT_REASON_MONITOR_INSTRUCTION:
+ case EXIT_REASON_MWAIT_INSTRUCTION:
+ /*
+ * MWAIT/MONITOR features are disabled by TDX Module (SEAM)
+ * and also re-suppressed in kernel by clearing
+ * X86_FEATURE_MWAIT CPU feature flag in tdx_early_init(). So
+ * if TD guest still executes MWAIT/MONITOR instruction with
+ * above suppression, it needs user attention.
+ */
+ WARN(1, "TD Guest used unsupported MWAIT/MONITOR instruction\n");
+ break;
default:
pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
return -EFAULT;
--
2.25.1

2021-03-31 21:55:52

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Wed, Mar 31, 2021, Kuppuswamy Sathyanarayanan wrote:
> Changes since v3:
> * WARN user if SEAM does not disable MONITOR/MWAIT instruction.

Why bother? There are a whole pile of features that are dictated by the TDX
module spec. MONITOR/MWAIT is about as uninteresting as it gets, e.g. absolute
worst case scenario is the guest kernel crashes, whereas a lot of spec violations
would compromise the security of the guest.

> + case EXIT_REASON_MONITOR_INSTRUCTION:
> + case EXIT_REASON_MWAIT_INSTRUCTION:
> + /*
> + * MWAIT/MONITOR features are disabled by TDX Module (SEAM)
> + * and also re-suppressed in kernel by clearing
> + * X86_FEATURE_MWAIT CPU feature flag in tdx_early_init(). So
> + * if TD guest still executes MWAIT/MONITOR instruction with
> + * above suppression, it needs user attention.
> + */
> + WARN(1, "TD Guest used unsupported MWAIT/MONITOR instruction\n");

Why not just WARN_ONCE and call it good?

> + break;
> default:
> pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
> return -EFAULT;
> --
> 2.25.1
>

2021-03-31 21:55:53

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/31/21 2:09 PM, Kuppuswamy Sathyanarayanan wrote:
> As per Guest-Host Communication Interface (GHCI) Specification
> for Intel TDX, sec 2.4.1, TDX architecture does not support
> MWAIT, MONITOR and WBINVD instructions. So in non-root TDX mode,
> if MWAIT/MONITOR instructions are executed with CPL != 0 it will
> trigger #UD, and for CPL = 0 case, virtual exception (#VE) is
> triggered. WBINVD instruction behavior is also similar to
> MWAIT/MONITOR, but for CPL != 0 case, it will trigger #GP instead
> of #UD.

Could we give it a go to try this in plain English before jumping in and
quoting the exact spec section? Also, the CPL language is nice and
precise for talking inside Intel, but it's generally easier for me to
read kernel descriptions when we just talk about the kernel.

When running as a TDX guest, there are a number of existing,
privileged instructions that do not work. If the guest kernel
uses these instructions, the hardware generates a #VE.

Which reminds me... The SDM says: MWAIT will "#UD ... If
CPUID.01H:ECX.MONITOR[bit 3] = 0". So, is this an architectural change?
The guest is *supposed* to see that CPUID bit as 0, so shouldn't it
also get a #UD? Or is this all so that if SEAM *forgets* to clear the
CPUID bit, the guest gets #VE?

What are we *actually* mitigating here?

Also, FWIW, MWAIT/MONITOR and WBINVD are pretty different beasts. I
think this would all have been a lot more clear if this would have been
two patches instead of shoehorning them into one.

> To prevent TD guest from using these unsupported instructions,
> following measures are adapted:
>
> 1. For MWAIT/MONITOR instructions, support for these instructions
> are already disabled by TDX module (SEAM). So CPUID flags for
> these instructions should be in disabled state. Also, just to be
> sure that these instructions are disabled, forcefully unset
> X86_FEATURE_MWAIT CPU cap in OS.
>
> 2. For WBINVD instruction, we use audit to find the code that uses
> this instruction and disable them for TD.

Really? Where are those patches?

> +static inline bool cpuid_has_mwait(void)
> +{
> + if (cpuid_ecx(1) & (1 << (X86_FEATURE_MWAIT % 32)))
> + return true;
> +
> + return false;
> +}
> +
> bool is_tdx_guest(void)
> {
> return static_cpu_has(X86_FEATURE_TDX_GUEST);
> @@ -301,12 +309,25 @@ static int tdg_handle_mmio(struct pt_regs *regs, struct ve_info *ve)
> return insn.length;
> }
>
> +/* Initialize TDX specific CPU capabilities */
> +static void __init tdx_cpu_cap_init(void)
> +{
> + setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
> +
> + if (cpuid_has_mwait()) {
> + WARN(1, "TDX Module failed to disable MWAIT\n");

WARN(1, "TDX guest enumerated support for MWAIT, disabling it").

> + /* MWAIT is not supported in TDX platform, so suppress it */
> + setup_clear_cpu_cap(X86_FEATURE_MWAIT);
> + }
> +
> +}

Extra newline.

> void __init tdx_early_init(void)
> {
> if (!cpuid_has_tdx_guest())
> return;
>
> - setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
> + tdx_cpu_cap_init();
>
> tdg_get_info();
>
> @@ -362,6 +383,27 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
> case EXIT_REASON_EPT_VIOLATION:
> ve->instr_len = tdg_handle_mmio(regs, ve);
> break;
> + case EXIT_REASON_WBINVD:
> + /*
> + * TDX architecture does not support WBINVD instruction.
> + * Currently, usage of this instruction is prevented by
> + * disabling the drivers which uses it. So if we still
> + * reach here, it needs user attention.
> + */

This comment is awfully vague. "TDX architecture..." what? Any CPUs
supporting the TDX architecture? TDX VMM's? TDX Guests?

Let's also not waste byte on stating the obvious. If it didn't need
attention we wouldn't be warning about it, eh?

So, let's halve the size of the comment and say:

/*
* WBINVD is not supported inside TDX guests. All in-
* kernel uses should have been disabled.
*/

> + pr_err("TD Guest used unsupported WBINVD instruction\n");
> + BUG();
> + break;
> + case EXIT_REASON_MONITOR_INSTRUCTION:
> + case EXIT_REASON_MWAIT_INSTRUCTION:
> + /*
> + * MWAIT/MONITOR features are disabled by TDX Module (SEAM)
> + * and also re-suppressed in kernel by clearing
> + * X86_FEATURE_MWAIT CPU feature flag in tdx_early_init(). So
> + * if TD guest still executes MWAIT/MONITOR instruction with
> + * above suppression, it needs user attention.
> + */

Again, let's trim this down:

/*
* Something in the kernel used MONITOR or MWAIT despite
* X86_FEATURE_MWAIT being cleared for TDX guests.
*/

Rather than naming the function, this makes it quite greppable to find
where it could have *possibly* been cleared.

> + WARN(1, "TD Guest used unsupported MWAIT/MONITOR instruction\n");
> + break;
> default:
> pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
> return -EFAULT;
>

2021-03-31 22:02:36

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/31/21 2:53 PM, Sean Christopherson wrote:
> On Wed, Mar 31, 2021, Kuppuswamy Sathyanarayanan wrote:
>> Changes since v3:
>> * WARN user if SEAM does not disable MONITOR/MWAIT instruction.
> Why bother? There are a whole pile of features that are dictated by the TDX
> module spec. MONITOR/MWAIT is about as uninteresting as it gets, e.g. absolute
> worst case scenario is the guest kernel crashes, whereas a lot of spec violations
> would compromise the security of the guest.

So, what should we do? In the #VE handler:

switch (exit_reason) {
case SOMETHING_WE_HANDLE:
blah();
break;
...
default:
pr_err("unhadled #VE, exit reason: %d\n", exit_reason);
BUG_ON(1);
}

?

Is this the *ONLY* one of these, or are we going to have another twenty?

If this is the only one, we might as well give a nice string error
message. If there are twenty more, let's just dump the exit reason,
BUG() and move on with our lives.

2021-03-31 22:08:55

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Wed, Mar 31, 2021, Dave Hansen wrote:
> On 3/31/21 2:53 PM, Sean Christopherson wrote:
> > On Wed, Mar 31, 2021, Kuppuswamy Sathyanarayanan wrote:
> >> Changes since v3:
> >> * WARN user if SEAM does not disable MONITOR/MWAIT instruction.
> > Why bother? There are a whole pile of features that are dictated by the TDX
> > module spec. MONITOR/MWAIT is about as uninteresting as it gets, e.g. absolute
> > worst case scenario is the guest kernel crashes, whereas a lot of spec violations
> > would compromise the security of the guest.
>
> So, what should we do? In the #VE handler:
>
> switch (exit_reason) {
> case SOMETHING_WE_HANDLE:
> blah();
> break;
> ...
> default:
> pr_err("unhadled #VE, exit reason: %d\n", exit_reason);
> BUG_ON(1);
> }
>
> ?
>
> Is this the *ONLY* one of these, or are we going to have another twenty?
>
> If this is the only one, we might as well give a nice string error
> message. If there are twenty more, let's just dump the exit reason,
> BUG() and move on with our lives.

I've no objection to a nice message in the #VE handler. What I'm objecting to
is sanity checking the CPUID model provided by the TDX module. If we don't
trust the TDX module to honor the spec, then there are a huge pile of things
that are far higher priority than MONITOR/MWAIT.

2021-03-31 22:14:40

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/31/21 3:06 PM, Sean Christopherson wrote:
> I've no objection to a nice message in the #VE handler. What I'm objecting to
> is sanity checking the CPUID model provided by the TDX module. If we don't
> trust the TDX module to honor the spec, then there are a huge pile of things
> that are far higher priority than MONITOR/MWAIT.

In other words: Don't muck with CPUID or the X86_FEATURE at all. Don't
check it to comply with the spec. If something doesn't comply, we'll
get a #VE at *SOME* point. We don't need to do belt-and-suspenders
programming here.

That sounds sane to me.

Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD



On 3/31/21 2:49 PM, Dave Hansen wrote:
> On 3/31/21 2:09 PM, Kuppuswamy Sathyanarayanan wrote:
>> As per Guest-Host Communication Interface (GHCI) Specification
>> for Intel TDX, sec 2.4.1, TDX architecture does not support
>> MWAIT, MONITOR and WBINVD instructions. So in non-root TDX mode,
>> if MWAIT/MONITOR instructions are executed with CPL != 0 it will
>> trigger #UD, and for CPL = 0 case, virtual exception (#VE) is
>> triggered. WBINVD instruction behavior is also similar to
>> MWAIT/MONITOR, but for CPL != 0 case, it will trigger #GP instead
>> of #UD.
>
> Could we give it a go to try this in plain English before jumping in and
> quoting the exact spec section? Also, the CPL language is nice and
> precise for talking inside Intel, but it's generally easier for me to
> read kernel descriptions when we just talk about the kernel.
>
> When running as a TDX guest, there are a number of existing,
> privileged instructions that do not work. If the guest kernel
> uses these instructions, the hardware generates a #VE.
I will fix it in next version.
>
> Which reminds me... The SDM says: MWAIT will "#UD ... If
> CPUID.01H:ECX.MONITOR[bit 3] = 0". So, is this an architectural change?
> The guest is *supposed* to see that CPUID bit as 0, so shouldn't it
> also get a #UD? Or is this all so that if SEAM *forgets* to clear the
> CPUID bit, the guest gets #VE?
AFAIK, we are only concerned about the case where the instruction support
is not disabled by SEAM. For disabled case, it should get #UD.
Sean, can you confirm it?
>
> What are we *actually* mitigating here?
we add support for #VE, when executing un-supported instruction in TD guest
kernel.
>
> Also, FWIW, MWAIT/MONITOR and WBINVD are pretty different beasts. I
> think this would all have been a lot more clear if this would have been
> two patches instead of shoehorning them into one.
Since all of them are unsupported instructions, I have grouped them
together. Even if we split it, there should be some duplication in commit
log (since handling is similar). But let me know if this is a desired
approach. I can split it in two patches.
>
>> To prevent TD guest from using these unsupported instructions,
>> following measures are adapted:
>>
>> 1. For MWAIT/MONITOR instructions, support for these instructions
>> are already disabled by TDX module (SEAM). So CPUID flags for
>> these instructions should be in disabled state. Also, just to be
>> sure that these instructions are disabled, forcefully unset
>> X86_FEATURE_MWAIT CPU cap in OS.
>>
>> 2. For WBINVD instruction, we use audit to find the code that uses
>> this instruction and disable them for TD.
>
> Really? Where are those patches?
For MWAIT/MONITOR, the change is included in the same patch.
For WBINVD, we have will have some patches included in next
series.
>
>> +static inline bool cpuid_has_mwait(void)
>> +{
>> + if (cpuid_ecx(1) & (1 << (X86_FEATURE_MWAIT % 32)))
>> + return true;
>> +
>> + return false;
>> +}
>> +
>> bool is_tdx_guest(void)
>> {
>> return static_cpu_has(X86_FEATURE_TDX_GUEST);
>> @@ -301,12 +309,25 @@ static int tdg_handle_mmio(struct pt_regs *regs, struct ve_info *ve)
>> return insn.length;
>> }
>>
>> +/* Initialize TDX specific CPU capabilities */
>> +static void __init tdx_cpu_cap_init(void)
>> +{
>> + setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
>> +
>> + if (cpuid_has_mwait()) {
>> + WARN(1, "TDX Module failed to disable MWAIT\n");
>
> WARN(1, "TDX guest enumerated support for MWAIT, disabling it").
will fix it in next version.
>
>> + /* MWAIT is not supported in TDX platform, so suppress it */
>> + setup_clear_cpu_cap(X86_FEATURE_MWAIT);
>> + }
>> +
>> +}
>
> Extra newline.
>
>> void __init tdx_early_init(void)
>> {
>> if (!cpuid_has_tdx_guest())
>> return;
>>
>> - setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
>> + tdx_cpu_cap_init();
>>
>> tdg_get_info();
>>
>> @@ -362,6 +383,27 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
>> case EXIT_REASON_EPT_VIOLATION:
>> ve->instr_len = tdg_handle_mmio(regs, ve);
>> break;
>> + case EXIT_REASON_WBINVD:
>> + /*
>> + * TDX architecture does not support WBINVD instruction.
>> + * Currently, usage of this instruction is prevented by
>> + * disabling the drivers which uses it. So if we still
>> + * reach here, it needs user attention.
>> + */
>
> This comment is awfully vague. "TDX architecture..." what? Any CPUs
> supporting the TDX architecture? TDX VMM's? TDX Guests?
>
> Let's also not waste byte on stating the obvious. If it didn't need
> attention we wouldn't be warning about it, eh?
>
> So, let's halve the size of the comment and say:
>
> /*
> * WBINVD is not supported inside TDX guests. All in-
> * kernel uses should have been disabled.
> */
ok. will fix it next version.
>
>> + pr_err("TD Guest used unsupported WBINVD instruction\n");
>> + BUG();
>> + break;
>> + case EXIT_REASON_MONITOR_INSTRUCTION:
>> + case EXIT_REASON_MWAIT_INSTRUCTION:
>> + /*
>> + * MWAIT/MONITOR features are disabled by TDX Module (SEAM)
>> + * and also re-suppressed in kernel by clearing
>> + * X86_FEATURE_MWAIT CPU feature flag in tdx_early_init(). So
>> + * if TD guest still executes MWAIT/MONITOR instruction with
>> + * above suppression, it needs user attention.
>> + */
>
> Again, let's trim this down:
>
> /*
> * Something in the kernel used MONITOR or MWAIT despite
> * X86_FEATURE_MWAIT being cleared for TDX guests.
> */
will fix it next version.
>
> Rather than naming the function, this makes it quite greppable to find
> where it could have *possibly* been cleared.
>
>> + WARN(1, "TD Guest used unsupported MWAIT/MONITOR instruction\n");
I think WARN_ONCE is good enough for this exception. Do you agree?
>> + break;
>> default:
>> pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
>> return -EFAULT;
>>
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD



On 3/31/21 3:11 PM, Dave Hansen wrote:
> On 3/31/21 3:06 PM, Sean Christopherson wrote:
>> I've no objection to a nice message in the #VE handler. What I'm objecting to
>> is sanity checking the CPUID model provided by the TDX module. If we don't
>> trust the TDX module to honor the spec, then there are a huge pile of things
>> that are far higher priority than MONITOR/MWAIT.
>
> In other words: Don't muck with CPUID or the X86_FEATURE at all. Don't
> check it to comply with the spec. If something doesn't comply, we'll
> get a #VE at *SOME* point. We don't need to do belt-and-suspenders
> programming here.
>
> That sounds sane to me.
But I think there are cases (like MCE) where SEAM does not disable them because
there will be future support for it. We should at-least suppress such features
in kernel.
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2021-03-31 22:34:39

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Wed, Mar 31, 2021, Kuppuswamy, Sathyanarayanan wrote:
>
> On 3/31/21 3:11 PM, Dave Hansen wrote:
> > On 3/31/21 3:06 PM, Sean Christopherson wrote:
> > > I've no objection to a nice message in the #VE handler. What I'm objecting to
> > > is sanity checking the CPUID model provided by the TDX module. If we don't
> > > trust the TDX module to honor the spec, then there are a huge pile of things
> > > that are far higher priority than MONITOR/MWAIT.
> >
> > In other words: Don't muck with CPUID or the X86_FEATURE at all. Don't
> > check it to comply with the spec. If something doesn't comply, we'll
> > get a #VE at *SOME* point. We don't need to do belt-and-suspenders
> > programming here.
> >
> > That sounds sane to me.
> But I think there are cases (like MCE) where SEAM does not disable them because
> there will be future support for it. We should at-least suppress such features
> in kernel.

MCE is a terrible example, because the TDX behavior for MCE is terrible.
Enumerating MCE as supported but injecting a #GP if the guest attempts to set
CR4.MCE=1 is awful. I'm all for treating that as a one-off case, with a very
derogatory comment :-)

2021-03-31 22:36:30

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/31/21 3:28 PM, Kuppuswamy, Sathyanarayanan wrote:
>
> On 3/31/21 3:11 PM, Dave Hansen wrote:
>> On 3/31/21 3:06 PM, Sean Christopherson wrote:
>>> I've no objection to a nice message in the #VE handler.  What I'm
>>> objecting to
>>> is sanity checking the CPUID model provided by the TDX module.  If we
>>> don't
>>> trust the TDX module to honor the spec, then there are a huge pile of
>>> things
>>> that are far higher priority than MONITOR/MWAIT.
>>
>> In other words:  Don't muck with CPUID or the X86_FEATURE at all.  Don't
>> check it to comply with the spec.  If something doesn't comply, we'll
>> get a #VE at *SOME* point.  We don't need to do belt-and-suspenders
>> programming here.
>>
>> That sounds sane to me.
> But I think there are cases (like MCE) where SEAM does not disable
> them because there will be future support for it. We should at-least
> suppress such features in kernel.

Specifics, please.

The hardware (and VMMs and SEAM) have ways of telling the guest kernel
what is supported: CPUID. If it screws up, and the guest gets an
unexpected #VE, so be it.

We don't have all kinds of crazy handling in the kernel's #UD handler
just in case a CPU mis-enumerates a feature and we get a #UD. We have
to trust the underlying hardware to be sane. If it isn't, we die a
horrible death as fast as possible. Why should TDX be any different?

2021-04-01 03:30:56

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

> The hardware (and VMMs and SEAM) have ways of telling the guest kernel
> what is supported: CPUID. If it screws up, and the guest gets an
> unexpected #VE, so be it.

The main reason for disabling stuff is actually that we don't need
to harden it. All these things are potential attack paths.

>
> We don't have all kinds of crazy handling in the kernel's #UD handler
> just in case a CPU mis-enumerates a feature and we get a #UD. We have
> to trust the underlying hardware to be sane. If it isn't, we die a
> horrible death as fast as possible. Why should TDX be any different?

That's what the original patch did -- no unnecessary checks -- but reviewers
keep asking for the extra checks, so Sathya added more. We have the not
unusual problem here that reviewers don't agree among themselves.

-Andi

2021-04-01 03:48:43

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/31/21 8:28 PM, Andi Kleen wrote:
>> The hardware (and VMMs and SEAM) have ways of telling the guest kernel
>> what is supported: CPUID. If it screws up, and the guest gets an
>> unexpected #VE, so be it.
> The main reason for disabling stuff is actually that we don't need
> to harden it. All these things are potential attack paths.

Wait, MWAIT is an attack path? If it were an attack path, wouldn't it
be an attack path that was created from the SEAM layer or the hardware
being broken? Aren't those two things within the trust boundary? Do we
harden against other things within the trust boundary?

>> We don't have all kinds of crazy handling in the kernel's #UD handler
>> just in case a CPU mis-enumerates a feature and we get a #UD. We have
>> to trust the underlying hardware to be sane. If it isn't, we die a
>> horrible death as fast as possible. Why should TDX be any different?
> That's what the original patch did -- no unnecessary checks -- but reviewers
> keep asking for the extra checks, so Sathya added more. We have the not
> unusual problem here that reviewers don't agree among themselves.

Getting consensus is a pain in the neck, eh?

It's too bad all the reviewers in the community aren't like all of the
engineers at big companies where everyone always agrees. :)

2021-04-01 04:26:27

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On Wed, Mar 31, 2021 at 08:46:18PM -0700, Dave Hansen wrote:
> On 3/31/21 8:28 PM, Andi Kleen wrote:
> >> The hardware (and VMMs and SEAM) have ways of telling the guest kernel
> >> what is supported: CPUID. If it screws up, and the guest gets an
> >> unexpected #VE, so be it.
> > The main reason for disabling stuff is actually that we don't need
> > to harden it. All these things are potential attack paths.
>
> Wait, MWAIT is an attack path? If it were an attack path, wouldn't it

No MWAIT is not, but lots of other things that can be controlled by the
host are. And that will be a motivation to disable things.

> >> We don't have all kinds of crazy handling in the kernel's #UD handler
> >> just in case a CPU mis-enumerates a feature and we get a #UD. We have
> >> to trust the underlying hardware to be sane. If it isn't, we die a
> >> horrible death as fast as possible. Why should TDX be any different?
> > That's what the original patch did -- no unnecessary checks -- but reviewers
> > keep asking for the extra checks, so Sathya added more. We have the not
> > unusual problem here that reviewers don't agree among themselves.
>
> Getting consensus is a pain in the neck, eh?

Tt seems more like a circular argument currently.
>
> It's too bad all the reviewers in the community aren't like all of the
> engineers at big companies where everyone always agrees. :)

I would propose to go back to the original patch without all the extra
checks. I think that's what you're arguing too. IIRC the person
who originally requested extra checks was Andy, if he's ok with
that too we can do it, so that you guys can finally move on
to the other patches that actually do more than just trivial things.

-Andi

Subject: [PATCH v5 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

When running as a TDX guest, there are a number of existing,
privileged instructions that do not work. If the guest kernel
uses these instructions, the hardware generates a #VE.

You can find the list of unsupported instructions in Intel
Trust Domain Extensions (Intel® TDX) Module specification,
sec 9.2.2 and in Guest-Host Communication Interface (GHCI)
Specification for Intel TDX, sec 2.4.1.
   
To prevent TD guest from using these unsupported instructions,
following measures are adapted:
   
1. For MWAIT/MONITOR instructions, support for these
instructions are already disabled by TDX module (SEAM).
So CPUID flags for these instructions should be in disabled
state. Also, just to be sure that these instructions are
disabled, forcefully unset X86_FEATURE_MWAIT CPU cap in OS.
       
2. For WBINVD instruction, we use audit to find the code that
uses this instruction and disable them for TD.

After the above mentioned preventive measures, if TD guests still
execute these instructions, add appropriate warning messages in #VE
handler.

Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
---

Changes since v4:
* Fixed commit log and comments as per Dave's comments
* Used WARN_ONCE for MWAIT/MONITOR #VE.
* Removed X86_FEATURE_MWAIT suppression code.

Changes since v3:
* WARN user if SEAM does not disable MONITOR/MWAIT instruction.
* Fix the commit log and comments to address review comments from
from Dave & Sean.

Changes since v2:
* Added BUG() for WBINVD, WARN for MONITOR instructions.
* Fixed comments as per Dave's review.

Changes since v1:
* Added WARN() for MWAIT #VE exception.

Changes since previous series:
* Suppressed MWAIT feature as per Andi's comment.
* Added warning debug log for MWAIT #VE exception.

arch/x86/kernel/tdx.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index e936b2f88bf6..9bc84caf4096 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -362,6 +362,22 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
case EXIT_REASON_EPT_VIOLATION:
ve->instr_len = tdg_handle_mmio(regs, ve);
break;
+ case EXIT_REASON_WBINVD:
+ /*
+ * WBINVD is not supported inside TDX guests. All in-
+ * kernel uses should have been disabled.
+ */
+ pr_err("TD Guest used unsupported WBINVD instruction\n");
+ BUG();
+ break;
+ case EXIT_REASON_MONITOR_INSTRUCTION:
+ case EXIT_REASON_MWAIT_INSTRUCTION:
+ /*
+ * Something in the kernel used MONITOR or MWAIT despite
+ * X86_FEATURE_MWAIT being cleared for TDX guests.
+ */
+ WARN_ONCE(1, "TD Guest used unsupported MWAIT/MONITOR instruction\n");
+ break;
default:
pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
return -EFAULT;
--
2.25.1

2021-04-07 21:15:51

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

> Hmm, I forgot about that quirk. I would expect the TDX Module to inject a #GP
> for that case. I can't find anything in the spec that confirms or denies that,
> but injecting #VE would be weird and pointless.
>
> Andi/Sathya, the TDX Module spec should be updated to state that XSETBV will
> #GP at CPL!=0. If that's not already the behavior, the module should probably
> be changed...

I asked about this and the answer was that XSETBV behaves architecturally inside
a TD (no #VE), thus there is nothing to document.

-Andi