Hi!
This is the vNMI patch series based on Santosh Shukla's vNMI patch series.
In this version of the patch series I addressed most of the review feedback,
added some more refactoring, and I believe also fixed the issue with migration.
I only tested this on a machine which doesn't have vNMI, so this does need
some testing to ensure that nothing is broken.
Best regards,
Maxim Levitsky
Maxim Levitsky (9):
KVM: nSVM: don't sync back tlb_ctl on nested VM exit
KVM: nSVM: clean up the copying of V_INTR bits from vmcb02 to vmcb12
KVM: nSVM: explicitly raise KVM_REQ_EVENT on nested VM exit if L1
doesn't intercept interrupts
KVM: SVM: drop the SVM specific H_FLAGS
KVM: x86: emulator: stop using raw host flags
KVM: SVM: add wrappers to enable/disable IRET interception
KVM: x86: add a delayed hardware NMI injection interface
KVM: SVM: implement support for vNMI
KVM: nSVM: implement support for nested VNMI
Santosh Shukla (2):
x86/cpu: Add CPUID feature bit for VNMI
KVM: SVM: Add VNMI bit definition
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/kvm-x86-ops.h | 2 +
arch/x86/include/asm/kvm_host.h | 24 +++--
arch/x86/include/asm/svm.h | 7 ++
arch/x86/kvm/emulate.c | 11 +--
arch/x86/kvm/kvm_emulate.h | 7 +-
arch/x86/kvm/smm.c | 2 -
arch/x86/kvm/svm/nested.c | 102 ++++++++++++++++---
arch/x86/kvm/svm/svm.c | 154 ++++++++++++++++++++++-------
arch/x86/kvm/svm/svm.h | 41 +++++++-
arch/x86/kvm/x86.c | 50 ++++++++--
11 files changed, 318 insertions(+), 83 deletions(-)
--
2.26.3
The CPU doesn't change the TLB_CTL value, as stated in the APM (15.16.1 - TLB Flush):
"The VMRUN instruction reads, but does not change, the
value of the TLB_CONTROL field"
Therefore KVM shouldn't do that either.
Signed-off-by: Maxim Levitsky <[email protected]>
---
arch/x86/kvm/svm/nested.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index bc9cd7086fa972..37af0338da7c32 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1010,7 +1010,6 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
vmcb12->control.next_rip = vmcb02->control.next_rip;
vmcb12->control.int_ctl = svm->nested.ctl.int_ctl;
- vmcb12->control.tlb_ctl = svm->nested.ctl.tlb_ctl;
vmcb12->control.event_inj = svm->nested.ctl.event_inj;
vmcb12->control.event_inj_err = svm->nested.ctl.event_inj_err;
--
2.26.3
From: Santosh Shukla <[email protected]>
VNMI exposes 3 capability bits (V_NMI, V_NMI_MASK, and V_NMI_ENABLE) to
virtualize NMI and NMI_MASK. These capability bits are part of
VMCB::intr_ctrl:
V_NMI(11) - Indicates whether a virtual NMI is pending in the guest.
V_NMI_MASK(12) - Indicates whether virtual NMI is masked in the guest.
V_NMI_ENABLE(26) - Enables the NMI virtualization feature for the guest.
When the hypervisor wants to inject an NMI, it sets the V_NMI bit. The
processor then clears V_NMI and sets V_NMI_MASK, which means the guest
is handling the NMI. After the guest has handled the NMI, the processor
clears V_NMI_MASK on successful completion of the IRET instruction, or
if a VMEXIT occurs while delivering the virtual NMI.
To enable the VNMI capability, the hypervisor needs to program the
V_NMI_ENABLE bit to 1.
Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Santosh Shukla <[email protected]>
---
arch/x86/include/asm/svm.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index cb1ee53ad3b189..26d6f549ce2b46 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -203,6 +203,13 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define X2APIC_MODE_SHIFT 30
#define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT)
+#define V_NMI_PENDING_SHIFT 11
+#define V_NMI_PENDING (1 << V_NMI_PENDING_SHIFT)
+#define V_NMI_MASK_SHIFT 12
+#define V_NMI_MASK (1 << V_NMI_MASK_SHIFT)
+#define V_NMI_ENABLE_SHIFT 26
+#define V_NMI_ENABLE (1 << V_NMI_ENABLE_SHIFT)
+
#define LBR_CTL_ENABLE_MASK BIT_ULL(0)
#define VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK BIT_ULL(1)
--
2.26.3
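The pending/masked transitions described in the commit message above can be modeled in plain C. The following is an illustrative userspace sketch of the documented behavior, not KVM code; the helper names (vnmi_inject, vnmi_deliver, vnmi_iret) are invented for this example.

```c
#include <stdint.h>

/* Bit layout from the patch above. */
#define V_NMI_PENDING_SHIFT 11
#define V_NMI_PENDING       (1u << V_NMI_PENDING_SHIFT)
#define V_NMI_MASK_SHIFT    12
#define V_NMI_MASK          (1u << V_NMI_MASK_SHIFT)
#define V_NMI_ENABLE_SHIFT  26
#define V_NMI_ENABLE        (1u << V_NMI_ENABLE_SHIFT)

/* Hypervisor wants to inject an NMI: set the pending bit. */
static inline uint32_t vnmi_inject(uint32_t int_ctl)
{
	return int_ctl | V_NMI_PENDING;
}

/* Processor takes the virtual NMI: clear pending and set the mask bit,
 * i.e. further NMIs are blocked while the guest handles this one. */
static inline uint32_t vnmi_deliver(uint32_t int_ctl)
{
	if ((int_ctl & V_NMI_ENABLE) && (int_ctl & V_NMI_PENDING) &&
	    !(int_ctl & V_NMI_MASK))
		int_ctl = (int_ctl & ~V_NMI_PENDING) | V_NMI_MASK;
	return int_ctl;
}

/* Guest completes IRET (or a VMEXIT occurs while delivering the
 * virtual NMI): the mask bit is cleared. */
static inline uint32_t vnmi_iret(uint32_t int_ctl)
{
	return int_ctl & ~V_NMI_MASK;
}
```

Walking an int_ctl value through vnmi_inject, vnmi_deliver, and vnmi_iret reproduces the pending -> masked -> unmasked sequence the commit message describes.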
If L1 doesn't intercept interrupts, then KVM will use vmcb02's
V_IRQ for L1 (to detect an interrupt window).
In this case, on nested VM exit KVM might need to copy the V_IRQ bit
from vmcb02 to vmcb01, to continue waiting for an
interrupt window.
To keep it simple, just raise the KVM_REQ_EVENT request, whose
handling will lead to re-enabling the interrupt
window if needed.
Note that this is a theoretical bug, because KVM already raises
KVM_REQ_EVENT on each nested VM exit: the nested
VM exit resets RFLAGS and kvm_set_rflags() raises
KVM_REQ_EVENT in response.
However, raising this request explicitly, together with
documenting why it is needed, is still preferred.
Signed-off-by: Maxim Levitsky <[email protected]>
---
arch/x86/kvm/svm/nested.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index aad3145b2f62fe..e891318595113e 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1016,6 +1016,31 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
svm_switch_vmcb(svm, &svm->vmcb01);
+ /*
+ * Note on synchronizing some int_ctl bits from vmcb02 to vmcb01:
+ *
+ * V_IRQ, V_IRQ_VECTOR, V_INTR_PRIO_MASK, V_IGN_TPR:
+ * If L1 doesn't intercept interrupts, then (even if L1 does use
+ * virtual interrupt masking), KVM uses vmcb02's V_IRQ to detect
+ * an interrupt window.  In this case, KVM raises KVM_REQ_EVENT
+ * to ensure that the interrupt window is not lost, which
+ * implicitly copies these bits from vmcb02 to vmcb01.
+ *
+ * V_TPR:
+ * If L1 doesn't use virtual interrupt masking, then L1's vTPR is
+ * stored in vmcb02, but its value doesn't need to be copied
+ * from/to vmcb01 because it is copied from/to the APIC's TPR
+ * register on each VM entry/exit.
+ *
+ * V_GIF:
+ * If nested vGIF is not used, KVM uses vmcb02's V_GIF for L1's
+ * V_GIF; however, L1's vGIF is reset to false on each VM exit,
+ * thus there is no need to copy it from vmcb02 to vmcb01.
+ */
+
+ if (!nested_exit_on_intr(svm))
+ kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
+
if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
svm_copy_lbrs(vmcb12, vmcb02);
svm_update_lbrv(vcpu);
--
2.26.3
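As a sanity check of the rule this patch adds, here is a tiny standalone model of the decision made on nested VM exit. This is plain C with invented names, not KVM code; it only mirrors the "request event re-evaluation when L1 does not intercept interrupts" logic.

```c
#include <stdbool.h>

/* Toy model of the nested VM exit path touched by this patch: when L1
 * does not intercept interrupts, KVM must re-evaluate pending events
 * (KVM_REQ_EVENT) so an interrupt window tracked via vmcb02's V_IRQ is
 * not lost.  Field and function names are invented for illustration. */
struct toy_vcpu {
	bool l1_intercepts_intr;	/* stand-in for nested_exit_on_intr() */
	bool event_requested;		/* stand-in for KVM_REQ_EVENT */
};

static void toy_nested_vmexit(struct toy_vcpu *vcpu)
{
	if (!vcpu->l1_intercepts_intr)
		vcpu->event_requested = true;
}
```

When L1 does intercept interrupts, no explicit request is needed in this path, because the VM exit is delivered to L1 rather than tracked via vmcb02's V_IRQ.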
Hi Maxim,
On 11/30/2022 1:07 AM, Maxim Levitsky wrote:
> The CPU doesn't change TLB_CTL value as stated in the PRM (15.16.2):
>
nits:
s / PRM (15.16.2) / APM (15.16.1 - TLB Flush)
> "The VMRUN instruction reads, but does not change, the
> value of the TLB_CONTROL field"
>
> Therefore the KVM shouldn't do that either.
>
> Signed-off-by: Maxim Levitsky <[email protected]>
> ---
> arch/x86/kvm/svm/nested.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index bc9cd7086fa972..37af0338da7c32 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1010,7 +1010,6 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
> vmcb12->control.next_rip = vmcb02->control.next_rip;
>
> vmcb12->control.int_ctl = svm->nested.ctl.int_ctl;
> - vmcb12->control.tlb_ctl = svm->nested.ctl.tlb_ctl;
> vmcb12->control.event_inj = svm->nested.ctl.event_inj;
> vmcb12->control.event_inj_err = svm->nested.ctl.event_inj_err;
>
On 11/30/2022 1:07 AM, Maxim Levitsky wrote:
> Hi!
>
> This is the vNMI patch series based on Santosh Shukla's vNMI patch series.
>
> In this version of this patch series I addressed most of the review feedback
> added some more refactoring and also I think fixed the issue with migration.
>
> I only tested this on a machine which doesn't have vNMI, so this does need
> some testing to ensure that nothing is broken.
>
> Best regards,
> Maxim Levitsky
>
Series tested on EPYC-v4.
Tested-By: Santosh Shukla <[email protected]>
>
> Maxim Levitsky (9):
> KVM: nSVM: don't sync back tlb_ctl on nested VM exit
> KVM: nSVM: clean up the copying of V_INTR bits from vmcb02 to vmcb12
> KVM: nSVM: explicitly raise KVM_REQ_EVENT on nested VM exit if L1
> doesn't intercept interrupts
> KVM: SVM: drop the SVM specific H_FLAGS
> KVM: x86: emulator: stop using raw host flags
> KVM: SVM: add wrappers to enable/disable IRET interception
> KVM: x86: add a delayed hardware NMI injection interface
> KVM: SVM: implement support for vNMI
> KVM: nSVM: implement support for nested VNMI
>
> Santosh Shukla (2):
> x86/cpu: Add CPUID feature bit for VNMI
> KVM: SVM: Add VNMI bit definition
>
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/kvm-x86-ops.h | 2 +
> arch/x86/include/asm/kvm_host.h | 24 +++--
> arch/x86/include/asm/svm.h | 7 ++
> arch/x86/kvm/emulate.c | 11 +--
> arch/x86/kvm/kvm_emulate.h | 7 +-
> arch/x86/kvm/smm.c | 2 -
> arch/x86/kvm/svm/nested.c | 102 ++++++++++++++++---
> arch/x86/kvm/svm/svm.c | 154 ++++++++++++++++++++++-------
> arch/x86/kvm/svm/svm.h | 41 +++++++-
> arch/x86/kvm/x86.c | 50 ++++++++--
> 11 files changed, 318 insertions(+), 83 deletions(-)
>
On Mon, 2022-12-05 at 19:35 +0530, Santosh Shukla wrote:
> Hi Maxim,
>
> On 11/30/2022 1:07 AM, Maxim Levitsky wrote:
> > The CPU doesn't change TLB_CTL value as stated in the PRM (15.16.2):
> >
> nits:
> s / PRM (15.16.2) / APM (15.16.1 - TLB Flush)
True for both changes, thanks!
Best regards,
Maxim Levitsky
>
> > "The VMRUN instruction reads, but does not change, the
> > value of the TLB_CONTROL field"
> >
> > Therefore the KVM shouldn't do that either.
> >
> > Signed-off-by: Maxim Levitsky <[email protected]>
> > ---
> > arch/x86/kvm/svm/nested.c | 1 -
> > 1 file changed, 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index bc9cd7086fa972..37af0338da7c32 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -1010,7 +1010,6 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
> > vmcb12->control.next_rip = vmcb02->control.next_rip;
> >
> > vmcb12->control.int_ctl = svm->nested.ctl.int_ctl;
> > - vmcb12->control.tlb_ctl = svm->nested.ctl.tlb_ctl;
> > vmcb12->control.event_inj = svm->nested.ctl.event_inj;
> > vmcb12->control.event_inj_err = svm->nested.ctl.event_inj_err;
> >
On Tue, 2022-11-29 at 21:37 +0200, Maxim Levitsky wrote:
> Hi!
>
> This is the vNMI patch series based on Santosh Shukla's vNMI patch series.
>
> In this version of this patch series I addressed most of the review feedback
> added some more refactoring and also I think fixed the issue with migration.
>
> I only tested this on a machine which doesn't have vNMI, so this does need
> some testing to ensure that nothing is broken.
>
> Best regards,
> Maxim Levitsky
>
> Maxim Levitsky (9):
> KVM: nSVM: don't sync back tlb_ctl on nested VM exit
> KVM: nSVM: clean up the copying of V_INTR bits from vmcb02 to vmcb12
> KVM: nSVM: explicitly raise KVM_REQ_EVENT on nested VM exit if L1
> doesn't intercept interrupts
> KVM: SVM: drop the SVM specific H_FLAGS
> KVM: x86: emulator: stop using raw host flags
> KVM: SVM: add wrappers to enable/disable IRET interception
> KVM: x86: add a delayed hardware NMI injection interface
> KVM: SVM: implement support for vNMI
> KVM: nSVM: implement support for nested VNMI
>
> Santosh Shukla (2):
> x86/cpu: Add CPUID feature bit for VNMI
> KVM: SVM: Add VNMI bit definition
>
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/kvm-x86-ops.h | 2 +
> arch/x86/include/asm/kvm_host.h | 24 +++--
> arch/x86/include/asm/svm.h | 7 ++
> arch/x86/kvm/emulate.c | 11 +--
> arch/x86/kvm/kvm_emulate.h | 7 +-
> arch/x86/kvm/smm.c | 2 -
> arch/x86/kvm/svm/nested.c | 102 ++++++++++++++++---
> arch/x86/kvm/svm/svm.c | 154 ++++++++++++++++++++++-------
> arch/x86/kvm/svm/svm.h | 41 +++++++-
> arch/x86/kvm/x86.c | 50 ++++++++--
> 11 files changed, 318 insertions(+), 83 deletions(-)
>
> --
> 2.26.3
>
>
A very kind ping on these patches.
Best regards,
Maxim Levitsky
On Tue, Dec 20, 2022, Maxim Levitsky wrote:
> On Tue, 2022-11-29 at 21:37 +0200, Maxim Levitsky wrote:
> > Hi!
> >
> > This is the vNMI patch series based on Santosh Shukla's vNMI patch series.
> >
> > In this version of this patch series I addressed most of the review feedback
> > added some more refactoring and also I think fixed the issue with migration.
> >
> > I only tested this on a machine which doesn't have vNMI, so this does need
> > some testing to ensure that nothing is broken.
> >
> > Best regards,
> > Maxim Levitsky
> >
> > Maxim Levitsky (9):
> > KVM: nSVM: don't sync back tlb_ctl on nested VM exit
> > KVM: nSVM: clean up the copying of V_INTR bits from vmcb02 to vmcb12
> > KVM: nSVM: explicitly raise KVM_REQ_EVENT on nested VM exit if L1
> > doesn't intercept interrupts
> > KVM: SVM: drop the SVM specific H_FLAGS
> > KVM: x86: emulator: stop using raw host flags
> > KVM: SVM: add wrappers to enable/disable IRET interception
> > KVM: x86: add a delayed hardware NMI injection interface
> > KVM: SVM: implement support for vNMI
> > KVM: nSVM: implement support for nested VNMI
> >
> > Santosh Shukla (2):
> > x86/cpu: Add CPUID feature bit for VNMI
> > KVM: SVM: Add VNMI bit definition
> >
> > arch/x86/include/asm/cpufeatures.h | 1 +
> > arch/x86/include/asm/kvm-x86-ops.h | 2 +
> > arch/x86/include/asm/kvm_host.h | 24 +++--
> > arch/x86/include/asm/svm.h | 7 ++
> > arch/x86/kvm/emulate.c | 11 +--
> > arch/x86/kvm/kvm_emulate.h | 7 +-
> > arch/x86/kvm/smm.c | 2 -
> > arch/x86/kvm/svm/nested.c | 102 ++++++++++++++++---
> > arch/x86/kvm/svm/svm.c | 154 ++++++++++++++++++++++-------
> > arch/x86/kvm/svm/svm.h | 41 +++++++-
> > arch/x86/kvm/x86.c | 50 ++++++++--
> > 11 files changed, 318 insertions(+), 83 deletions(-)
> >
> > --
> > 2.26.3
> >
> >
> A very kind ping on these patches.
Sorry, I won't get to this (or anything else) until the new year.
On Tue, 2022-11-29 at 21:37 +0200, Maxim Levitsky wrote:
> Hi!
>
> This is the vNMI patch series based on Santosh Shukla's vNMI patch series.
>
> In this version of this patch series I addressed most of the review feedback
> added some more refactoring and also I think fixed the issue with migration.
>
> I only tested this on a machine which doesn't have vNMI, so this does need
> some testing to ensure that nothing is broken.
>
> Best regards,
> Maxim Levitsky
>
> Maxim Levitsky (9):
> KVM: nSVM: don't sync back tlb_ctl on nested VM exit
> KVM: nSVM: clean up the copying of V_INTR bits from vmcb02 to vmcb12
> KVM: nSVM: explicitly raise KVM_REQ_EVENT on nested VM exit if L1
> doesn't intercept interrupts
> KVM: SVM: drop the SVM specific H_FLAGS
> KVM: x86: emulator: stop using raw host flags
> KVM: SVM: add wrappers to enable/disable IRET interception
> KVM: x86: add a delayed hardware NMI injection interface
> KVM: SVM: implement support for vNMI
> KVM: nSVM: implement support for nested VNMI
>
> Santosh Shukla (2):
> x86/cpu: Add CPUID feature bit for VNMI
> KVM: SVM: Add VNMI bit definition
>
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/kvm-x86-ops.h | 2 +
> arch/x86/include/asm/kvm_host.h | 24 +++--
> arch/x86/include/asm/svm.h | 7 ++
> arch/x86/kvm/emulate.c | 11 +--
> arch/x86/kvm/kvm_emulate.h | 7 +-
> arch/x86/kvm/smm.c | 2 -
> arch/x86/kvm/svm/nested.c | 102 ++++++++++++++++---
> arch/x86/kvm/svm/svm.c | 154 ++++++++++++++++++++++-------
> arch/x86/kvm/svm/svm.h | 41 +++++++-
> arch/x86/kvm/x86.c | 50 ++++++++--
> 11 files changed, 318 insertions(+), 83 deletions(-)
>
> --
> 2.26.3
>
>
Another kind ping on this patch series.
Best regards,
Maxim Levitsky
Shortlog is too long, maybe this?
KVM: nSVM: Raise event on nested VM exit if L1 doesn't intercept IRQs
On Tue, Nov 29, 2022, Maxim Levitsky wrote:
> If the L2 doesn't intercept interrupts, then the KVM will use vmcb02's
s/the L2/L2, though don't you mean L1?
> V_IRQ for L1 (to detect the interrupt window)
"an interrupt window", i.e. there's not just one window.
> In this case on the nested VM exit KVM might need to copy the V_IRQ bit
s/the nested/nested
> from the vmcb02 to the vmcb01, to continue waiting for the
> interrupt window.
>
> To make it simple, just raise the KVM_REQ_EVENT request, which
> execution will lead to the reenabling of the interrupt
> window if needed.
>
> Note that this is a theoretical bug because the KVM already does raise
s/the KVM/KVM
> the KVM_REQ_EVENT request one each nested VM exit because the nested
s/the KVM_REQ_EVENT/KVM_REQ_EVENT, and s/one/on
> VM exit resets RFLAGS and the kvm_set_rflags() raises the
> KVM_REQ_EVENT request in the response.
>
> However raising this request explicitly, together with
> documenting why this is needed, is still preferred.
>
> Signed-off-by: Maxim Levitsky <[email protected]>
> ---
> arch/x86/kvm/svm/nested.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index aad3145b2f62fe..e891318595113e 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1016,6 +1016,31 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>
> svm_switch_vmcb(svm, &svm->vmcb01);
>
> + /* Note about synchronizing some of int_ctl bits from vmcb02 to vmcb01:
/*
* Preferred block comment style...
> + *
> + * - V_IRQ, V_IRQ_VECTOR, V_INTR_PRIO_MASK, V_IGN_TPR:
Drop the "-" to be consistent with the rest of the paragraphs.
> + * If the L2 doesn't intercept interrupts, then
> + * (even if the L2 does use virtual interrupt masking),
KVM uses "L2" to refer to the thing running at L2. I think what you are referring
to here is vmcb12? And that's controlled by L1.
> + * KVM will use the vmcb02's V_INTR to detect interrupt window.
s/the vmcb02/vmcb02
Which of the V_INTR fields does this refer to? Oooh, you're saying the KVM injects
a virtual interrupt into L2 using vmcb02 in order to determine when L2 has IRQs
enabled.
Why does KVM do that? Why not pend the actual IRQ directly?
> + *
> + * In this case, the KVM raises the KVM_REQ_EVENT to ensure that interrupt window
s/the KVM_REQ_EVENT/KVM_REQ_EVENT
> + * is not lost and this implicitly copies these bits from vmcb02 to vmcb01
Too many pronouns. What do "this" and "these bits" refer to?
> + *
> + * V_TPR:
> + * If the L2 doesn't use virtual interrupt masking, then the L1's vTPR
> + * is stored in the vmcb02 but its value doesn't need to be copied from/to
> + * vmcb01 because it is copied from/to the TPR APIC's register on
> + * each VM entry/exit.
> + *
> + * V_GIF:
> + * - If the nested vGIF is not used, KVM uses vmcb02's V_GIF for L1's V_GIF,
Drop this "-" too.
> + * however, the L1 vGIF is reset to false on each VM exit, thus
> + * there is no need to copy it from vmcb02 to vmcb01.
> + */
> +
> + if (!nested_exit_on_intr(svm))
> + kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
> +
> if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
> svm_copy_lbrs(vmcb12, vmcb02);
> svm_update_lbrv(vcpu);
> --
> 2.26.3
>
On Tue, Nov 29, 2022, Maxim Levitsky wrote:
> Hi!
>
> This is the vNMI patch series based on Santosh Shukla's vNMI patch series.
>
> In this version of this patch series I addressed most of the review feedback
> added some more refactoring and also I think fixed the issue with migration.
>
> I only tested this on a machine which doesn't have vNMI, so this does need
> some testing to ensure that nothing is broken.
Apologies for the slow review.
Did a fast run through, mostly have questions to address my lack of knowledge.
I'll give this a much more thorough review first thing next week (my brain is
fried), and am planning on queueing it unless I see something truly busted (I'll
fixup my nits when applying).
On Sat, Jan 28, 2023, Sean Christopherson wrote:
> > + * If the L2 doesn't intercept interrupts, then
> > + * (even if the L2 does use virtual interrupt masking),
>
> KVM uses "L2" to refer to the thing running at L2. I think what you are referring
> to here is vmcb12? And that's controlled by L1.
>
> > + * KVM will use the vmcb02's V_INTR to detect interrupt window.
>
> s/the vmcb02/vmcb02
>
> Which of the V_INTR fields does this refer to? Oooh, you're saying the KVM injects
> a virtual interrupt into L2 using vmcb02 in order to determine when L2 has IRQs
> enabled.
>
> Why does KVM do that? Why not pend the actual IRQ directly?
Duh, because KVM needs to gain control if there are multiple pending events.
On Tue, Nov 29, 2022, Maxim Levitsky wrote:
> From: Santosh Shukla <[email protected]>
>
> VNMI exposes 3 capability bits (V_NMI, V_NMI_MASK, and V_NMI_ENABLE) to
> virtualize NMI and NMI_MASK, Those capability bits are part of
> VMCB::intr_ctrl -
> V_NMI(11) - Indicates whether a virtual NMI is pending in the guest.
> V_NMI_MASK(12) - Indicates whether virtual NMI is masked in the guest.
> V_NMI_ENABLE(26) - Enables the NMI virtualization feature for the guest.
>
> When Hypervisor wants to inject NMI, it will set V_NMI bit, Processor
> will clear the V_NMI bit and Set the V_NMI_MASK which means the Guest is
> handling NMI, After the guest handled the NMI, The processor will clear
> the V_NMI_MASK on the successful completion of IRET instruction Or if
> VMEXIT occurs while delivering the virtual NMI.
>
> To enable the VNMI capability, Hypervisor need to program
> V_NMI_ENABLE bit 1.
>
> Reviewed-by: Maxim Levitsky <[email protected]>
> Signed-off-by: Santosh Shukla <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index cb1ee53ad3b189..26d6f549ce2b46 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -203,6 +203,13 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
> #define X2APIC_MODE_SHIFT 30
> #define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT)
>
> +#define V_NMI_PENDING_SHIFT 11
> +#define V_NMI_PENDING (1 << V_NMI_PENDING_SHIFT)
> +#define V_NMI_MASK_SHIFT 12
> +#define V_NMI_MASK (1 << V_NMI_MASK_SHIFT)
Argh, more KVM warts. The existing INT_CTL defines all use "mask" in the name,
so looking at V_NMI_MASK in the context of other code reads "vNMI is pending",
not "vNMIs are blocked".
IMO, the existing _MASK terminology is the one that's wrong, but there's an absurd
amount of prior art in svm.h :-(
And the really annoying one is V_INTR_MASKING_MASK, which IIRC says "virtual INTR
masking is enabled", not "virtual INTRs are blocked".
So maybe call this V_NMI_BLOCKING_MASK? And tack on _MASK to the others (even
though I agree it's ugly).
> +#define V_NMI_ENABLE_SHIFT 26
> +#define V_NMI_ENABLE (1 << V_NMI_ENABLE_SHIFT)
Hrm. I think I would prefer to keep the defines ordered by bit position. Knowing
that there's an enable bit isn't all that critical for understanding vNMI pending
and blocked.
On Tue, Dec 06, 2022, Santosh Shukla wrote:
> Series tested on EPYC-v4.
> Tested-By: Santosh Shukla <[email protected]>
In the future, please use Tested-by, not Tested-By. For whatever reason, the
preferred kernel style for tags is to capitalize only the first word, e.g.
Co-developed-by, Tested-by, Reviewed-by, etc...
On Tue, 29 Nov 2022 21:37:06 +0200, Maxim Levitsky wrote:
> This is the vNMI patch series based on Santosh Shukla's vNMI patch series.
>
> In this version of this patch series I addressed most of the review feedback
> added some more refactoring and also I think fixed the issue with migration.
>
> I only tested this on a machine which doesn't have vNMI, so this does need
> some testing to ensure that nothing is broken.
>
> [...]
Applied 1, 4, and 5 to kvm-x86 svm. I split patch 4 as doing so made the
HF_GIF_MASK change super trivial.
vNMI support will get pushed beyond v6.3, but I will do my best to promptly
review future versions, while I still have all of this paged in...
[01/11] KVM: nSVM: Don't sync tlb_ctl back to vmcb12 on nested VM-Exit
https://github.com/kvm-x86/linux/commit/8957cbcfed0a
[04/11] KVM: x86: Move HF_GIF_MASK into "struct vcpu_svm" as "guest_gif"
https://github.com/kvm-x86/linux/commit/c760e86f27fe
[04/11] KVM: x86: Move HF_NMI_MASK and HF_IRET_MASK into "struct vcpu_svm"
https://github.com/kvm-x86/linux/commit/916b54a7688b
[05/11] KVM: x86: Use emulator callbacks instead of duplicating "host flags"
https://github.com/kvm-x86/linux/commit/32e69f232db4
--
https://github.com/kvm-x86/linux/tree/next
https://github.com/kvm-x86/linux/tree/fixes
On 2/1/2023 4:12 AM, Sean Christopherson wrote:
> On Tue, Nov 29, 2022, Maxim Levitsky wrote:
>> From: Santosh Shukla <[email protected]>
>>
>> VNMI exposes 3 capability bits (V_NMI, V_NMI_MASK, and V_NMI_ENABLE) to
>> virtualize NMI and NMI_MASK, Those capability bits are part of
>> VMCB::intr_ctrl -
>> V_NMI(11) - Indicates whether a virtual NMI is pending in the guest.
>> V_NMI_MASK(12) - Indicates whether virtual NMI is masked in the guest.
>> V_NMI_ENABLE(26) - Enables the NMI virtualization feature for the guest.
>>
>> When Hypervisor wants to inject NMI, it will set V_NMI bit, Processor
>> will clear the V_NMI bit and Set the V_NMI_MASK which means the Guest is
>> handling NMI, After the guest handled the NMI, The processor will clear
>> the V_NMI_MASK on the successful completion of IRET instruction Or if
>> VMEXIT occurs while delivering the virtual NMI.
>>
>> To enable the VNMI capability, Hypervisor need to program
>> V_NMI_ENABLE bit 1.
>>
>> Reviewed-by: Maxim Levitsky <[email protected]>
>> Signed-off-by: Santosh Shukla <[email protected]>
>> ---
>> arch/x86/include/asm/svm.h | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
>> index cb1ee53ad3b189..26d6f549ce2b46 100644
>> --- a/arch/x86/include/asm/svm.h
>> +++ b/arch/x86/include/asm/svm.h
>> @@ -203,6 +203,13 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
>> #define X2APIC_MODE_SHIFT 30
>> #define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT)
>>
>> +#define V_NMI_PENDING_SHIFT 11
>> +#define V_NMI_PENDING (1 << V_NMI_PENDING_SHIFT)
>> +#define V_NMI_MASK_SHIFT 12
>> +#define V_NMI_MASK (1 << V_NMI_MASK_SHIFT)
>
> Argh, more KVM warts. The existing INT_CTL defines all use "mask" in the name,
> so looking at V_NMI_MASK in the context of other code reads "vNMI is pending",
> not "vNMIs are blocked".
>
> IMO, the existing _MASK terminology is the one that's wrong, but there's an absurd
> amount of prior art in svm.h :-(
>
> And the really annoying one is V_INTR_MASKING_MASK, which IIRC says "virtual INTR
> masking is enabled", not "virtual INTRs are blocked".
>
> So maybe call this V_NMI_BLOCKING_MASK? And tack on _MASK too the others (even
> though I agree it's ugly).
>
Sure.
>> +#define V_NMI_ENABLE_SHIFT 26
>> +#define V_NMI_ENABLE (1 << V_NMI_ENABLE_SHIFT)
>
> Hrm. I think I would prefer to keep the defines ordered by bit position. Knowing
> that there's an enable bit isn't all that critical for understanding vNMI pending
> and blocked.
Sure, Sean. Will include in V3.
Thanks,
Santosh