2023-12-06 08:38:01

by Li, Xin3

[permalink] [raw]
Subject: RE: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED

> Subject: RE: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED
>
> > >+ if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK)
> > >+     kvm_requeue_exception_e(vcpu, vector, vmcs_read32(error_code_field),
> > >+                             idt_vectoring_info & INTR_INFO_NESTED_EXCEPTION_MASK);
> > >+ else
> > >+     kvm_requeue_exception(vcpu, vector,
> > >+                           idt_vectoring_info & INTR_INFO_NESTED_EXCEPTION_MASK);
> >
> > Exiting-event identification can also have bit 13 set, indicating a
> > nested exception encountered and caused VM-exit. when reinjecting the
> > exception to guests, kvm needs to set the "nested" bit, right? I
> > suspect some changes to e.g., handle_exception_nmi() are needed.
>
> The current patch relies on kvm_multiple_exception() to do that. But TBH, I'm
> not sure it can recognize all nested cases. I probably should revisit it.

So the conclusion is that kvm_multiple_exception() is smart enough, and
a VMM doesn't have to check bit 13 of the Exiting-event identification.

In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
Support, there is a statement at the end of Exiting-event identification:

(The value of this bit is always identical to that of the valid bit of
the original-event identification field.)

It means that even w/o VMX Nested-Exception support, a VMM already knows
whether an exception is a nested exception encountered during delivery of
another event in an exception-caused VM exit (exit reason 0). This is
done in KVM by reading IDT_VECTORING_INFO_FIELD and calling
vmx_complete_interrupts() immediately after VM exits.

vmx_complete_interrupts() simply queues the original exception if there is
one, and later the nested exception causing the VM exit could be cancelled
if it is a shadow page fault. However, if the shadow page fault is caused
by a guest page fault, KVM injects it as a nested exception to have the
guest fix its page table.

I will add comments about this background in the next iteration.
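
For readers following along, the equivalence described above can be sketched
in a few lines of C. This is only an illustration, not the patch: the macro
and function names are made up for this sketch, and the bit positions (bit 31
as the valid bit, bit 13 as the "nested" bit) are taken from the discussion
in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; bit positions as discussed in this thread. */
#define SKETCH_INFO_VALID_MASK            (1u << 31) /* valid bit of either field */
#define SKETCH_INFO_NESTED_EXCEPTION_MASK (1u << 13) /* "nested" bit, FRED only */

/*
 * IDT-compatible check: the exception that caused the VM exit is nested
 * if the original-event identification (IDT-vectoring info) is valid.
 */
static bool nested_from_original_event(uint32_t idt_vectoring_info)
{
	return (idt_vectoring_info & SKETCH_INFO_VALID_MASK) != 0;
}

/*
 * FRED-only check: read bit 13 of the exiting-event identification.
 * Per the spec statement quoted above, this bit mirrors the valid bit
 * tested in nested_from_original_event().
 */
static bool nested_from_exiting_event(uint32_t exit_intr_info)
{
	return (exit_intr_info & SKETCH_INFO_NESTED_EXCEPTION_MASK) != 0;
}
```

Under the assumption that hardware keeps the two bits identical, either helper
gives the same answer for a given VM exit; only the first one also works when
VMX Nested-Exception support is absent.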


2023-12-07 08:42:59

by Chao Gao

[permalink] [raw]
Subject: Re: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED

On Wed, Dec 06, 2023 at 04:37:39PM +0800, Li, Xin3 wrote:
>> Subject: RE: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED
>>
>> > >+ if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK)
>> > >+     kvm_requeue_exception_e(vcpu, vector, vmcs_read32(error_code_field),
>> > >+                             idt_vectoring_info & INTR_INFO_NESTED_EXCEPTION_MASK);
>> > >+ else
>> > >+     kvm_requeue_exception(vcpu, vector,
>> > >+                           idt_vectoring_info & INTR_INFO_NESTED_EXCEPTION_MASK);
>> >
>> > Exiting-event identification can also have bit 13 set, indicating a
>> > nested exception encountered and caused VM-exit. when reinjecting the
>> > exception to guests, kvm needs to set the "nested" bit, right? I
>> > suspect some changes to e.g., handle_exception_nmi() are needed.
>>
>> The current patch relies on kvm_multiple_exception() to do that. But TBH, I'm
>> not sure it can recognize all nested cases. I probably should revisit it.
>
>So the conclusion is that kvm_multiple_exception() is smart enough, and
>a VMM doesn't have to check bit 13 of the Exiting-event identification.
>
>In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
>Support, there is a statement at the end of Exiting-event identification:
>
>(The value of this bit is always identical to that of the valid bit of
>the original-event identification field.)
>
>It means that even w/o VMX Nested-Exception support, a VMM already knows
>if an exception is a nested exception encountered during delivery of
>another event in an exception caused VM exit (exit reason 0). This is
>done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
>vmx_complete_interrupts() immediately after VM exits.
>
>vmx_complete_interrupts() simply queues the original exception if there is
>one, and later the nested exception causing the VM exit could be cancelled
>if it is a shadow page fault. However if the shadow page fault is caused
>by a guest page fault, KVM injects it as a nested exception to have guest
>fix its page table.
>
>I will add comments about this background in the next iteration.

Is it possible that the CPU encounters an exception, causing a VM exit, while
injecting an __interrupt__? In that case, no __exception__ will be (re-)queued
by vmx_complete_interrupts().

2023-12-07 10:10:14

by Li, Xin3

[permalink] [raw]
Subject: RE: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED

> >> > Exiting-event identification can also have bit 13 set, indicating a
> >> > nested exception encountered and caused VM-exit. when reinjecting the
> >> > exception to guests, kvm needs to set the "nested" bit, right? I
> >> > suspect some changes to e.g., handle_exception_nmi() are needed.
> >>
> >> The current patch relies on kvm_multiple_exception() to do that. But TBH, I'm
> >> not sure it can recognize all nested cases. I probably should revisit it.
> >
> >So the conclusion is that kvm_multiple_exception() is smart enough, and
> >a VMM doesn't have to check bit 13 of the Exiting-event identification.
> >
> >In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
> >Support, there is a statement at the end of Exiting-event identification:
> >
> >(The value of this bit is always identical to that of the valid bit of
> >the original-event identification field.)
> >
> >It means that even w/o VMX Nested-Exception support, a VMM already knows
> >if an exception is a nested exception encountered during delivery of
> >another event in an exception caused VM exit (exit reason 0). This is
> >done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
> >vmx_complete_interrupts() immediately after VM exits.
> >
> >vmx_complete_interrupts() simply queues the original exception if there is
> >one, and later the nested exception causing the VM exit could be cancelled
> >if it is a shadow page fault. However if the shadow page fault is caused
> >by a guest page fault, KVM injects it as a nested exception to have guest
> >fix its page table.
> >
> >I will add comments about this background in the next iteration.
>
> is it possible that the CPU encounters an exception and causes VM-exit during
> injecting an __interrupt__? in this case, no __exception__ will be (re-)queued
> by vmx_complete_interrupts().

I guess the following case is what you're suggesting:
KVM injects an external interrupt after shadow page tables are nuked.

vmx_complete_interrupts() is called after each VM exit to clear both the
interrupt and exception queues, which means it always pushes the
deepest event if there is an original event. In the above case, the
original event is the external interrupt KVM just tried to inject.
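
A toy model of that requeue step, for illustration only. The struct, the
`sk_` names, and the helper are hypothetical and much simpler than what KVM
actually does; the event-type encodings in bits 10:8 (0 = external interrupt,
2 = NMI, 3 = hardware exception) follow the VMX interruption-information
format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; field layout follows the VMX info-field format. */
#define SK_INFO_VALID_MASK     (1u << 31)
#define SK_INFO_TYPE_MASK      (7u << 8)
#define SK_INFO_VECTOR_MASK    0xffu
#define SK_TYPE_EXT_INTR       (0u << 8)
#define SK_TYPE_NMI            (2u << 8)
#define SK_TYPE_HARD_EXCEPTION (3u << 8)

/* Toy stand-in for the per-vCPU event queues. */
struct sk_vcpu {
	bool exception_injected;
	bool interrupt_injected;
	bool nmi_injected;
	uint8_t vector;
};

/*
 * Start from empty queues, then requeue whatever original event the
 * original-event identification reports, regardless of its type.
 */
static struct sk_vcpu sk_after_exit(uint32_t idt_vectoring_info)
{
	struct sk_vcpu v = { false, false, false, 0 };

	if (!(idt_vectoring_info & SK_INFO_VALID_MASK))
		return v; /* no original event: nothing to requeue */

	v.vector = (uint8_t)(idt_vectoring_info & SK_INFO_VECTOR_MASK);

	switch (idt_vectoring_info & SK_INFO_TYPE_MASK) {
	case SK_TYPE_NMI:
		v.nmi_injected = true;
		break;
	case SK_TYPE_HARD_EXCEPTION:
		v.exception_injected = true;
		break;
	case SK_TYPE_EXT_INTR:
		v.interrupt_injected = true;
		break;
	default:
		break; /* other event types are left out of this sketch */
	}
	return v;
}
```

With an external interrupt as the original event, this model ends up with
interrupt_injected set and exception_injected clear, which is the case being
discussed here.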

2023-12-08 01:57:04

by Chao Gao

[permalink] [raw]
Subject: Re: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED

On Thu, Dec 07, 2023 at 06:09:46PM +0800, Li, Xin3 wrote:
>> >> > Exiting-event identification can also have bit 13 set, indicating a
>> >> > nested exception encountered and caused VM-exit. when reinjecting the
>> >> > exception to guests, kvm needs to set the "nested" bit, right? I
>> >> > suspect some changes to e.g., handle_exception_nmi() are needed.
>> >>
>> >> The current patch relies on kvm_multiple_exception() to do that. But TBH, I'm
>> >> not sure it can recognize all nested cases. I probably should revisit it.
>> >
>> >So the conclusion is that kvm_multiple_exception() is smart enough, and
>> >a VMM doesn't have to check bit 13 of the Exiting-event identification.
>> >
>> >In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
>> >Support, there is a statement at the end of Exiting-event identification:
>> >
>> >(The value of this bit is always identical to that of the valid bit of
>> >the original-event identification field.)
>> >
>> >It means that even w/o VMX Nested-Exception support, a VMM already knows
>> >if an exception is a nested exception encountered during delivery of
>> >another event in an exception caused VM exit (exit reason 0). This is
>> >done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
>> >vmx_complete_interrupts() immediately after VM exits.
>> >
>> >vmx_complete_interrupts() simply queues the original exception if there is
>> >one, and later the nested exception causing the VM exit could be cancelled
>> >if it is a shadow page fault. However if the shadow page fault is caused
>> >by a guest page fault, KVM injects it as a nested exception to have guest
>> >fix its page table.
>> >
>> >I will add comments about this background in the next iteration.
>>
>> is it possible that the CPU encounters an exception and causes VM-exit during
>> injecting an __interrupt__? in this case, no __exception__ will be (re-)queued
>> by vmx_complete_interrupts().
>
>I guess the following case is what you're suggesting:
>KVM injects an external interrupt after shadow page tables are nuked.
>
>vmx_complete_interrupts() are called after each VM exit to clear both
>interrupt and exception queues, which means it always pushes the
>deepest event if there is an original event. In the above case, the
>original event is the external interrupt KVM just tried to inject.

In my understanding, your point is:
1. If bit 13 of the exiting-event identification is set, the original-event
identification field should be valid.
2. vmx_complete_interrupts() runs immediately after VM exits, reads the
original-event identification, and reinjects the event there.
3. If KVM injects the exception in the exiting-event identification
into the guest, KVM doesn't need to read bit 13 because kvm_multiple_exception()
is "smart enough" to recognize the exception as a nested exception: if
bit 13 is 1, an exception must have been queued in #2.

My question is:
What if the event in the original-event identification is an interrupt, e.g.,
an external interrupt or NMI, rather than an exception? vmx_complete_interrupts()
won't queue an exception, so how can KVM or kvm_multiple_exception() know that the
exception that caused the VM exit is a nested exception w/o reading bit 13 of the
exiting-event identification?

2023-12-08 23:49:01

by Li, Xin3

[permalink] [raw]
Subject: RE: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED

> >> >> > Exiting-event identification can also have bit 13 set, indicating a
> >> >> > nested exception encountered and caused VM-exit. when reinjecting the
> >> >> > exception to guests, kvm needs to set the "nested" bit, right? I
> >> >> > suspect some changes to e.g., handle_exception_nmi() are needed.
> >> >>
> >> >> The current patch relies on kvm_multiple_exception() to do that. But TBH, I'm
> >> >> not sure it can recognize all nested cases. I probably should revisit it.
> >> >
> >> >So the conclusion is that kvm_multiple_exception() is smart enough, and
> >> >a VMM doesn't have to check bit 13 of the Exiting-event identification.
> >> >
> >> >In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
> >> >Support, there is a statement at the end of Exiting-event identification:
> >> >
> >> >(The value of this bit is always identical to that of the valid bit of
> >> >the original-event identification field.)
> >> >
> >> >It means that even w/o VMX Nested-Exception support, a VMM already knows
> >> >if an exception is a nested exception encountered during delivery of
> >> >another event in an exception caused VM exit (exit reason 0). This is
> >> >done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
> >> >vmx_complete_interrupts() immediately after VM exits.
> >> >
> >> >vmx_complete_interrupts() simply queues the original exception if there is
> >> >one, and later the nested exception causing the VM exit could be cancelled
> >> >if it is a shadow page fault. However if the shadow page fault is caused
> >> >by a guest page fault, KVM injects it as a nested exception to have guest
> >> >fix its page table.
> >> >
> >> >I will add comments about this background in the next iteration.
> >>
> >> is it possible that the CPU encounters an exception and causes VM-exit during
> >> injecting an __interrupt__? in this case, no __exception__ will be (re-)queued
> >> by vmx_complete_interrupts().
> >
> >I guess the following case is what you're suggesting:
> >KVM injects an external interrupt after shadow page tables are nuked.
> >
> >vmx_complete_interrupts() are called after each VM exit to clear both
> >interrupt and exception queues, which means it always pushes the
> >deepest event if there is an original event. In the above case, the
> >original event is the external interrupt KVM just tried to inject.
>
> in my understanding, your point is:
> 1. if bit 13 of the Exiting-event identification is set. the original-event
> identification field should be valid.
> 2. vmx_complete_interrupts() is done immediately after VM exits and reads
> original-event identification and reinjects the event there.
> 3. if KVM injects the exception in exiting-event identification
> to guest, KVM doesn't need to read the bit 13 because kvm_multiple_exception()
> is "smart enough" and recognize the exception as nested-exception because if
> bit 13 is 1, one exception must has been queued in #2.
>
> my question is:
> what if the event in original-event identification is an interrupt e.g.,
> external interrupt or NMI, rather than exception. vmx_complete_interrupts()
> won't queue an exception, then how can KVM or kvm_multiple_exception() know the
> exception that caused VM-exit is an nested exception w/o reading bit 13 of the
> Exiting-event identification?

The good news is that vmx_complete_interrupts() still queues the event
even if it's not a hardware exception. It's just that kvm_multiple_exception()
doesn't check whether there is an original interrupt or NMI, because IDT event
delivery doesn't care about such a case.

I think your point is more that we should check it when FRED is enabled
for a guest. Yes, architecturally we should do it.

What I want to emphasize is that bit 13 of the exiting-event identification
is set to the valid bit of the original-event identification; they are
logically the same thing when FRED is enabled. It doesn't matter which one
a VMM reads and uses. But a VMM doesn't need to differentiate FRED and IDT
if it reads the info from the original-event identification.
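
To make the difference concrete, here is a minimal sketch of the two ways of
deciding "nested" that this exchange is about. The enum and both function
names are invented for the sketch and are not KVM code. The first helper
mimics a check that only looks for a queued original exception; the second
treats any valid original event as making the exiting exception nested, which
is what the valid bit (and hence bit 13) reports.

```c
#include <stdbool.h>

/* Hypothetical classification of the original event, for this sketch only. */
enum sk_event {
	SK_NONE,      /* original-event identification not valid */
	SK_EXCEPTION, /* original event was an exception */
	SK_EXT_INTR,  /* original event was an external interrupt */
	SK_NMI        /* original event was an NMI */
};

/*
 * Exception-only view: the exiting exception is flagged only when the
 * original event was itself an exception. An original interrupt or NMI
 * is not considered.
 */
static bool sk_nested_exception_only(enum sk_event original)
{
	return original == SK_EXCEPTION;
}

/*
 * Valid-bit view: any valid original event, whatever its type, makes
 * the exiting exception a nested one.
 */
static bool sk_nested_any_original_event(enum sk_event original)
{
	return original != SK_NONE;
}
```

The two helpers agree for SK_NONE and SK_EXCEPTION and differ for SK_EXT_INTR
and SK_NMI, which is exactly the case raised in the question above.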