2020-03-19 09:18:32

by Joerg Roedel

[permalink] [raw]
Subject: [PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

From: Tom Lendacky <[email protected]>

Add the handler for #VC exceptions invoked at runtime.

Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
---
arch/x86/entry/entry_64.S | 4 ++
arch/x86/include/asm/traps.h | 7 ++++
arch/x86/kernel/idt.c | 4 +-
arch/x86/kernel/sev-es.c | 77 +++++++++++++++++++++++++++++++++++-
4 files changed, 90 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f2bb91e87877..729876d368c5 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1210,6 +1210,10 @@ idtentry async_page_fault do_async_page_fault has_error_code=1 read_cr2=1
idtentry machine_check do_mce has_error_code=0 paranoid=1
#endif

+#ifdef CONFIG_AMD_MEM_ENCRYPT
+idtentry vmm_communication do_vmm_communication has_error_code=1
+#endif
+
/*
* Save all registers in pt_regs, and switch gs if needed.
* Use slow, but surefire "are we in kernel?" check.
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 2aa786484bb1..1be25c065698 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -35,6 +35,9 @@ asmlinkage void alignment_check(void);
#ifdef CONFIG_X86_MCE
asmlinkage void machine_check(void);
#endif /* CONFIG_X86_MCE */
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+asmlinkage void vmm_communication(void);
+#endif
asmlinkage void simd_coprocessor_error(void);

#if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
@@ -93,6 +96,10 @@ dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code);
dotraplinkage void do_machine_check(struct pt_regs *regs, long error_code);
#endif
dotraplinkage void do_simd_coprocessor_error(struct pt_regs *regs, long error_code);
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+dotraplinkage void do_vmm_communication_error(struct pt_regs *regs,
+ long error_code);
+#endif
#ifdef CONFIG_X86_32
dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code);
#endif
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index 135d208a2d38..25fa8ba70993 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -88,8 +88,10 @@ static const __initconst struct idt_data def_idts[] = {
#ifdef CONFIG_X86_MCE
INTG(X86_TRAP_MC, &machine_check),
#endif
-
SYSG(X86_TRAP_OF, overflow),
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ INTG(X86_TRAP_VC, vmm_communication),
+#endif
#if defined(CONFIG_IA32_EMULATION)
SYSG(IA32_SYSCALL_VECTOR, entry_INT80_compat),
#elif defined(CONFIG_X86_32)
diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 4bf5286310a0..97241d2f0f70 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -20,7 +20,7 @@
#include <asm/insn-eval.h>
#include <asm/fpu/internal.h>
#include <asm/processor.h>
-#include <asm/trap_defs.h>
+#include <asm/traps.h>
#include <asm/svm.h>

/* For early boot hypervisor communication in SEV-ES enabled guests */
@@ -251,6 +251,81 @@ static enum es_result vc_handle_exitcode(struct es_em_ctxt *ctxt,
return result;
}

+static void vc_forward_exception(struct es_em_ctxt *ctxt)
+{
+ long error_code = ctxt->fi.error_code;
+ int trapnr = ctxt->fi.vector;
+
+ ctxt->regs->orig_ax = ctxt->fi.error_code;
+
+ switch (trapnr) {
+ case X86_TRAP_GP:
+ do_general_protection(ctxt->regs, error_code);
+ break;
+ case X86_TRAP_UD:
+ do_invalid_op(ctxt->regs, 0);
+ break;
+ default:
+ BUG();
+ }
+}
+
+dotraplinkage void do_vmm_communication(struct pt_regs *regs, unsigned long exit_code)
+{
+ struct es_em_ctxt ctxt;
+ enum es_result result;
+ struct ghcb *ghcb;
+
+ /*
+ * This is invoked through an interrupt gate, so IRQs are disabled. The
+ * code below might walk page-tables for user or kernel addresses, so
+ * keep the IRQs disabled to protect us against concurrent TLB flushes.
+ */
+
+ ghcb = (struct ghcb *)this_cpu_ptr(ghcb_page);
+
+ vc_ghcb_invalidate(ghcb);
+ result = vc_init_em_ctxt(&ctxt, regs, exit_code);
+
+ if (result == ES_OK)
+ result = vc_handle_exitcode(&ctxt, ghcb, exit_code);
+
+ /* Done - now check the result */
+ switch (result) {
+ case ES_OK:
+ vc_finish_insn(&ctxt);
+ break;
+ case ES_UNSUPPORTED:
+ pr_emerg("Unsupported exit-code 0x%02lx in early #VC exception (IP: 0x%lx)\n",
+ exit_code, regs->ip);
+ goto fail;
+ case ES_VMM_ERROR:
+ pr_emerg("PANIC: Failure in communication with VMM (exit-code 0x%02lx IP: 0x%lx)\n",
+ exit_code, regs->ip);
+ goto fail;
+ case ES_DECODE_FAILED:
+ pr_emerg("PANIC: Failed to decode instruction (exit-code 0x%02lx IP: 0x%lx)\n",
+ exit_code, regs->ip);
+ goto fail;
+ case ES_EXCEPTION:
+ vc_forward_exception(&ctxt);
+ break;
+ case ES_RETRY:
+ /* Nothing to do */
+ break;
+ default:
+ BUG();
+ }
+
+ return;
+
+fail:
+ show_regs(regs);
+
+ while (true)
+ halt();
+}
+
bool __init boot_vc_exception(struct pt_regs *regs)
{
unsigned long exit_code = regs->orig_ax;
--
2.17.1


2020-03-19 15:45:13

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

On Thu, Mar 19, 2020 at 2:14 AM Joerg Roedel <[email protected]> wrote:
>
> From: Tom Lendacky <[email protected]>
>
> Add the handler for #VC exceptions invoked at runtime.

If I read this correctly, this does not use IST. If that's true, I
don't see how this can possibly work. There at least two nasty cases
that come to mind:

1. SYSCALL followed by NMI. The NMI IRET hack gets to #VC and we
explode. This is fixable by getting rid of the NMI EFLAGS.TF hack.

2. tools/testing/selftests/x86/mov_ss_trap_64. User code does MOV
(addr), SS; SYSCALL, where addr has a data breakpoint. We get #DB
promoted to #VC with no stack.

2020-03-19 16:26:39

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

On Thu, Mar 19, 2020 at 08:44:03AM -0700, Andy Lutomirski wrote:
> On Thu, Mar 19, 2020 at 2:14 AM Joerg Roedel <[email protected]> wrote:
> >
> > From: Tom Lendacky <[email protected]>
> >
> > Add the handler for #VC exceptions invoked at runtime.
>
> If I read this correctly, this does not use IST. If that's true, I
> don't see how this can possibly work. There at least two nasty cases
> that come to mind:
>
> 1. SYSCALL followed by NMI. The NMI IRET hack gets to #VC and we
> explode. This is fixable by getting rid of the NMI EFLAGS.TF hack.

Not an issue in this patch-set, the confusion comes from the fact that I
left some parts of the single-step-over-iret code in the patch. But it
is not used. The NMI handling in this patch-set sends the NMI-complete
message before the IRET, when the kernel is still in a safe environment
(kernel stack, kernel cr3).

> 2. tools/testing/selftests/x86/mov_ss_trap_64. User code does MOV
> (addr), SS; SYSCALL, where addr has a data breakpoint. We get #DB
> promoted to #VC with no stack.

Also not an issue, as debugging is not supported at the moment in SEV-ES
guests (hardware has no way yet to save/restore the debug registers
across #VMEXITs). But this will change with future hardware. If you look
at the implementation for dr7 read/write events, you see that the dr7
value is cached and returned, but does not make it to the hardware dr7.

I though about using IST for the #VC handler, but the implications for
nesting #VC handlers made me decide against it. But for future hardware
that supports debugging inside SEV-ES guests it will be an issue. I'll
think about how to fix the problem, it probably has to be IST :(

Regards,

Joerg

2020-03-19 18:45:28

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

On Thu, Mar 19, 2020 at 9:24 AM Joerg Roedel <[email protected]> wrote:
>
> On Thu, Mar 19, 2020 at 08:44:03AM -0700, Andy Lutomirski wrote:
> > On Thu, Mar 19, 2020 at 2:14 AM Joerg Roedel <[email protected]> wrote:
> > >
> > > From: Tom Lendacky <[email protected]>
> > >
> > > Add the handler for #VC exceptions invoked at runtime.
> >
> > If I read this correctly, this does not use IST. If that's true, I
> > don't see how this can possibly work. There at least two nasty cases
> > that come to mind:
> >
> > 1. SYSCALL followed by NMI. The NMI IRET hack gets to #VC and we
> > explode. This is fixable by getting rid of the NMI EFLAGS.TF hack.
>
> Not an issue in this patch-set, the confusion comes from the fact that I
> left some parts of the single-step-over-iret code in the patch. But it
> is not used. The NMI handling in this patch-set sends the NMI-complete
> message before the IRET, when the kernel is still in a safe environment
> (kernel stack, kernel cr3).

Got it!

>
> > 2. tools/testing/selftests/x86/mov_ss_trap_64. User code does MOV
> > (addr), SS; SYSCALL, where addr has a data breakpoint. We get #DB
> > promoted to #VC with no stack.
>
> Also not an issue, as debugging is not supported at the moment in SEV-ES
> guests (hardware has no way yet to save/restore the debug registers
> across #VMEXITs). But this will change with future hardware. If you look
> at the implementation for dr7 read/write events, you see that the dr7
> value is cached and returned, but does not make it to the hardware dr7.

Eek. This would probably benefit from some ptrace / perf logic to
prevent the kernel or userspace from thinking that debugging works.

I guess this means that #DB only happens due to TF or INT01. I
suppose this is probably okay.

>
> I though about using IST for the #VC handler, but the implications for
> nesting #VC handlers made me decide against it. But for future hardware
> that supports debugging inside SEV-ES guests it will be an issue. I'll
> think about how to fix the problem, it probably has to be IST :(

Or future generations could have enough hardware support for debugging
that #DB doesn't need to be intercepted or can be re-injected
correctly with the #DB vector.

>
> Regards,
>
> Joerg

2020-03-19 19:38:56

by Jörg Rödel

[permalink] [raw]
Subject: Re: [PATCH 41/70] x86/sev-es: Add Runtime #VC Exception Handler

On Thu, Mar 19, 2020 at 11:43:20AM -0700, Andy Lutomirski wrote:
> Or future generations could have enough hardware support for debugging
> that #DB doesn't need to be intercepted or can be re-injected
> correctly with the #DB vector.

Yeah, the problem is, the GHCB spec suggests the single-step-over-iret
way to re-enable the NMI window and requires intercepting #DB for it. So
the hypervisor probably still has to intercept it, even when debug
support is added some day. I need to think more about this.

Regards,

Joerg