Don't advance RIP or inject a single-step #DB if emulation signals a
fault. This logic applies to all state updates that are conditional on
clean retirement of the emulated instruction, e.g. updating RFLAGS was
previously handled by commit 38827dbd3fb85 ("KVM: x86: Do not update
EFLAGS on faulting emulation").

Not advancing RIP is likely a nop, i.e. ctxt->eip isn't updated with
ctxt->_eip until emulation "retires" anyway. Skipping #DB injection
fixes a bug reported by Andy Lutomirski where a #UD on SYSCALL due to
invalid state with RFLAGS.RF=1 would loop indefinitely due to emulation
overwriting the #UD with #DB and thus restarting the bad SYSCALL over
and over.

Cc: Nadav Amit <[email protected]>
Cc: [email protected]
Reported-by: Andy Lutomirski <[email protected]>
Fixes: 663f4c61b803 ("KVM: x86: handle singlestep during emulation")
Signed-off-by: Sean Christopherson <[email protected]>
---

Note, this has a minor conflict with my recent series to clean up the
emulator return flows[*]. The end result should look something like:

	if (!ctxt->have_exception ||
	    exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
		kvm_rip_write(vcpu, ctxt->eip);
		if (r && ctxt->tf)
			r = kvm_vcpu_do_singlestep(vcpu);
		__kvm_set_rflags(vcpu, ctxt->eflags);
	}

[*] https://lkml.kernel.org/r/[email protected]

arch/x86/kvm/x86.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4cfd786d0b6..d2962671c3d3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6611,12 +6611,13 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 		unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
 		toggle_interruptibility(vcpu, ctxt->interruptibility);
 		vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
-		kvm_rip_write(vcpu, ctxt->eip);
-		if (r == EMULATE_DONE && ctxt->tf)
-			kvm_vcpu_do_singlestep(vcpu, &r);
 		if (!ctxt->have_exception ||
-		    exception_type(ctxt->exception.vector) == EXCPT_TRAP)
+		    exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
+			kvm_rip_write(vcpu, ctxt->eip);
+			if (r == EMULATE_DONE && ctxt->tf)
+				kvm_vcpu_do_singlestep(vcpu, &r);
 			__kvm_set_rflags(vcpu, ctxt->eflags);
+		}
 
 		/*
 		 * For STI, interrupts are shadowed; so KVM_REQ_EVENT will
--
2.22.0
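
The gating in the hunk above can be modeled outside the kernel. What follows is a minimal stand-alone sketch, not KVM code: struct emu_result, struct vcpu_state, retires_cleanly() and commit_emulation() are hypothetical stand-ins, and the r == EMULATE_DONE check from the real hunk is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for KVM's exception classes. */
enum excpt_class { EXCPT_FAULT, EXCPT_TRAP, EXCPT_ABORT };

/* Hypothetical, trimmed-down view of what the emulator reports. */
struct emu_result {
	bool have_exception;	/* emulation queued an exception */
	enum excpt_class excpt;	/* class of that exception, if any */
	bool tf;		/* RFLAGS.TF was set when the insn started */
	unsigned long eip;	/* RIP the emulator wants to retire */
	unsigned long eflags;	/* RFLAGS the emulator wants to retire */
};

/* Hypothetical, trimmed-down vCPU state. */
struct vcpu_state {
	unsigned long rip;
	unsigned long rflags;
	bool db_pending;	/* a single-step #DB has been queued */
};

/* The instruction retires if it completed or raised a trap. */
static bool retires_cleanly(const struct emu_result *res)
{
	return !res->have_exception || res->excpt == EXCPT_TRAP;
}

/* Commit RIP, single-step #DB and RFLAGS together, or not at all. */
static void commit_emulation(struct vcpu_state *vcpu,
			     const struct emu_result *res)
{
	if (!retires_cleanly(res))
		return;	/* fault: keep RIP/RFLAGS, don't queue a #DB */

	vcpu->rip = res->eip;
	if (res->tf)
		vcpu->db_pending = true;
	vcpu->rflags = res->eflags;
}
```

With a faulting result (e.g. the #UD on SYSCALL case) and TF set, commit_emulation() leaves the vCPU untouched, so the fault is not overwritten by a #DB.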
2019-08-23 13:55-0700, Sean Christopherson:
> @@ -6611,12 +6611,13 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
> unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
> toggle_interruptibility(vcpu, ctxt->interruptibility);
> vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
> - kvm_rip_write(vcpu, ctxt->eip);
> - if (r == EMULATE_DONE && ctxt->tf)
> - kvm_vcpu_do_singlestep(vcpu, &r);
> if (!ctxt->have_exception ||
> - exception_type(ctxt->exception.vector) == EXCPT_TRAP)
> + exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
Hm, EXCPT_TRAP is either #OF, #BP, or another #DB, none of which we want
to override. The first two disable TF and the last one is the same, as
its fault variant must take another path, so it works out in the end...

I've fixed the RF in the commit message when applying, thanks.
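
For reference, the trap set named above is what you get when classifying by vector alone. A minimal stand-alone sketch, loosely modeled on what exception_type() does; classify_vector() and the enum names here are illustrative, not the kernel's code:

```c
/* Architectural x86 exception vectors used below. */
enum {
	DB_VECTOR = 1,
	BP_VECTOR = 3,
	OF_VECTOR = 4,
	UD_VECTOR = 6,
	DF_VECTOR = 8,
	MC_VECTOR = 18,
};

/* Hypothetical stand-ins for KVM's exception classes. */
enum excpt_class { EXCPT_FAULT, EXCPT_TRAP, EXCPT_ABORT };

/* Classify an exception purely by its vector number. */
static enum excpt_class classify_vector(unsigned int vector)
{
	unsigned int mask;

	if (vector > 31)
		return EXCPT_FAULT;	/* not an exception vector; sketch only */

	mask = 1u << vector;

	/* Every #DB is lumped in with the traps here. */
	if (mask & ((1u << DB_VECTOR) | (1u << BP_VECTOR) | (1u << OF_VECTOR)))
		return EXCPT_TRAP;

	if (mask & ((1u << DF_VECTOR) | (1u << MC_VECTOR)))
		return EXCPT_ABORT;

	/* Everything else, including #UD, is treated as a fault. */
	return EXCPT_FAULT;
}
```

Since the decision uses only the vector, a #UD from a bad SYSCALL lands in the fault bucket and skips the retire path, while any #DB is treated as a trap regardless of its cause.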
---
We still seem to have at least a minor problem with single stepping:
SDM, Interrupt 1—Debug Exception (#DB):

  The following items detail the treatment of debug exceptions on the
  instruction boundary following execution of the MOV or the POP
  instruction that loads the SS register:

  • If EFLAGS.TF is 1, no single-step trap is generated.

I think a check for KVM_X86_SHADOW_INT_MOV_SS in
kvm_vcpu_do_singlestep() is missing.
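
One way to express the missing condition, as a stand-alone sketch rather than a proposed patch. singlestep_trap_due() is a hypothetical helper, and SHADOW_MOV_SS is an illustrative flag standing in for KVM_X86_SHADOW_INT_MOV_SS; how the shadow state reaches kvm_vcpu_do_singlestep() is not shown and is an assumption:

```c
#include <stdbool.h>

#define EFLAGS_TF	0x100u	/* architectural trap flag, RFLAGS bit 8 */
#define SHADOW_MOV_SS	0x01u	/* illustrative: MOV/POP SS shadow active */

/*
 * Decide whether a single-step #DB is due on the instruction boundary
 * after the emulated instruction retires.
 */
static bool singlestep_trap_due(unsigned long rflags, unsigned int int_shadow)
{
	if (!(rflags & EFLAGS_TF))
		return false;	/* not single-stepping at all */

	/* Per the SDM text above: no single-step trap after MOV/POP SS. */
	if (int_shadow & SHADOW_MOV_SS)
		return false;

	return true;
}
```

The shadow check only suppresses the trap on the boundary right after the SS load; the following instruction would still single-step as usual.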
On Fri, Aug 23, 2019 at 1:55 PM Sean Christopherson
<[email protected]> wrote:
> @@ -6611,12 +6611,13 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
> unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
> toggle_interruptibility(vcpu, ctxt->interruptibility);
> vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
> - kvm_rip_write(vcpu, ctxt->eip);
> - if (r == EMULATE_DONE && ctxt->tf)
> - kvm_vcpu_do_singlestep(vcpu, &r);
> if (!ctxt->have_exception ||
> - exception_type(ctxt->exception.vector) == EXCPT_TRAP)
> + exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
NYC, but...
I don't think this check for "exception_type" is quite right. A
general detect fault (which can be synthesized by check_dr_read) is
mischaracterized by exception_type() as a trap. Or maybe I'm missing
something? (I often am.)
> + kvm_rip_write(vcpu, ctxt->eip);
> + if (r == EMULATE_DONE && ctxt->tf)
> + kvm_vcpu_do_singlestep(vcpu, &r);
> __kvm_set_rflags(vcpu, ctxt->eflags);
> + }
>
> /*
> * For STI, interrupts are shadowed; so KVM_REQ_EVENT will
> --
> 2.22.0
>
On Tue, Aug 27, 2019 at 12:12:51PM -0700, Jim Mattson wrote:
> On Fri, Aug 23, 2019 at 1:55 PM Sean Christopherson
> <[email protected]> wrote:
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -6611,12 +6611,13 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
> > unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
> > toggle_interruptibility(vcpu, ctxt->interruptibility);
> > vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
> > - kvm_rip_write(vcpu, ctxt->eip);
> > - if (r == EMULATE_DONE && ctxt->tf)
> > - kvm_vcpu_do_singlestep(vcpu, &r);
> > if (!ctxt->have_exception ||
> > - exception_type(ctxt->exception.vector) == EXCPT_TRAP)
> > + exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
>
> NYC, but...
>
> I don't think this check for "exception_type" is quite right. A
> general detect fault (which can be synthesized by check_dr_read) is
> mischaracterized by exception_type() as a trap. Or maybe I'm missing
> something? (I often am.)
Pretty sure you're not missing anything.
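
To make the distinction concrete, here is a sketch that looks at the DR6 payload of a #DB instead of only its vector. DR6.BD (bit 13) and DR6.BS (bit 14) are the architectural bits; db_is_fault_like() is a hypothetical helper, not an existing KVM function. Instruction breakpoints, which are also fault-like, are deliberately left out because they can't be identified from DR6 alone.

```c
#include <stdbool.h>

#define DR6_BD	(1ul << 13)	/* general detect: debug register access */
#define DR6_BS	(1ul << 14)	/* single-step */

/*
 * A general-detect #DB is reported before the offending MOV DR
 * executes, so it has fault semantics even though it shares a vector
 * with the trap-like single-step #DB.
 */
static bool db_is_fault_like(unsigned long dr6)
{
	return (dr6 & DR6_BD) != 0;
}
```

A vector-only check cannot tell these two cases apart, which is the mischaracterization described above.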

And while we're poking holes in #DB emulation, int1/icebp isn't emulated
correctly as it should be reinjected with INTR_TYPE_PRIV_SW_EXCEPTION, not
as an INTR_TYPE_HARD_EXCEPTION. The CPU automatically clears DR7.GD on #DB,
unless the #DB is due to int1...