2022-08-29 10:34:52

by Santosh Shukla

Subject: [PATCHv4 0/8] Virtual NMI feature


Change History:

v4 (v6.0-rc3):
05 - added nmi_l1_to_l2 check (Review comment from Maciej).

v3 (rebased on eb555cb5b794f):
https://lore.kernel.org/all/[email protected]/

v2:
https://lore.kernel.org/lkml/[email protected]/T/#m4bf8a131748688fed00ab0fefdcac209a169e202

v1:
https://lore.kernel.org/all/[email protected]/

Description:
Currently, an NMI is delivered to the guest using the Event Injection
mechanism [1]. The Event Injection mechanism does not block the delivery
of subsequent NMIs, so the hypervisor needs to track the NMI delivery
and its completion (by intercepting IRET) before sending a new NMI.

Virtual NMI (VNMI) allows the hypervisor to inject an NMI into the guest
without using the Event Injection mechanism, so the hypervisor is not
required to track the guest NMI or intercept IRET. To achieve that, the
VNMI feature provides virtualized NMI and NMI_MASK capability bits in
the VMCB interrupt control field (int_ctl) -
V_NMI(11) - Indicates whether a virtual NMI is pending in the guest.
V_NMI_MASK(12) - Indicates whether virtual NMI is masked in the guest.
V_NMI_ENABLE(26) - Enables the NMI virtualization feature for the guest.

When the hypervisor wants to inject an NMI, it sets the V_NMI bit. The
processor then clears the V_NMI bit and sets V_NMI_MASK, which means the
guest is handling the NMI. After the guest has handled the NMI, the
processor clears V_NMI_MASK on successful completion of the IRET
instruction, or if a VMEXIT occurs while delivering the virtual NMI.
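
The delivery sequence described above can be modeled in a few lines of C.
This is only an illustrative sketch, not KVM or hardware code: struct
model_vmcb and the model_* helpers are hypothetical names, and the bit
positions are the ones listed in the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as listed in the description above. */
#define V_NMI        (UINT32_C(1) << 11) /* virtual NMI pending */
#define V_NMI_MASK   (UINT32_C(1) << 12) /* virtual NMI masked */
#define V_NMI_ENABLE (UINT32_C(1) << 26) /* NMI virtualization enabled */

/* Hypothetical stand-in for the VMCB interrupt control field. */
struct model_vmcb {
	uint32_t int_ctl;
};

/* Hypervisor side: request an NMI by setting the pending bit. */
static void model_inject_vnmi(struct model_vmcb *vmcb)
{
	assert(vmcb->int_ctl & V_NMI_ENABLE);
	vmcb->int_ctl |= V_NMI;
}

/*
 * Processor side: deliver a pending virtual NMI unless it is masked.
 * Delivery clears V_NMI and sets V_NMI_MASK. Returns true if an NMI
 * was delivered.
 */
static bool model_hw_deliver(struct model_vmcb *vmcb)
{
	if (!(vmcb->int_ctl & V_NMI) || (vmcb->int_ctl & V_NMI_MASK))
		return false;

	vmcb->int_ctl &= ~V_NMI;
	vmcb->int_ctl |= V_NMI_MASK;
	return true;
}

/*
 * Processor side: IRET completed (or a VMEXIT occurred while the
 * virtual NMI was being delivered), so the mask is cleared.
 */
static void model_hw_unmask(struct model_vmcb *vmcb)
{
	vmcb->int_ctl &= ~V_NMI_MASK;
}
```

In this model, a second NMI injected while V_NMI_MASK is set simply stays
pending in V_NMI and is delivered once the mask is cleared, with no
hypervisor tracking in between.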

If NMI virtualization is enabled and the NMI_INTERCEPT bit is unset,
then HW will exit with the #INVALID exit reason.

To enable the VNMI capability, the hypervisor needs to set the
V_NMI_ENABLE bit to 1.

The presence of this feature is indicated via the CPUID function
0x8000000A_EDX[25].
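
As a rough illustration, the CPUID bit above could be probed from user
space as follows. This is a sketch that assumes GCC or Clang on x86 (for
<cpuid.h> and __get_cpuid()); the helper and macro names are made up for
the example and are not part of this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Leaf and bit taken from the description: CPUID 0x8000000A, EDX[25]. */
#define SVM_FEATURE_LEAF 0x8000000Au
#define SVM_EDX_VNMI_BIT 25

static_assert(SVM_EDX_VNMI_BIT < 32, "bit index must fit in EDX");

/* Pure helper: test the VNMI bit in an EDX value from the SVM leaf. */
static bool vnmi_bit_set(uint32_t edx)
{
	return (edx >> SVM_EDX_VNMI_BIT) & 1u;
}

/* Query the running CPU; reports false on non-x86 builds. */
static bool host_reports_vnmi(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 when the leaf is out of range. */
	if (!__get_cpuid(SVM_FEATURE_LEAF, &eax, &ebx, &ecx, &edx))
		return false;

	return vnmi_bit_set(edx);
#else
	return false;
#endif
}
```

Inside the kernel the series uses boot_cpu_has(X86_FEATURE_V_NMI) instead;
the user-space probe is only a convenient way to check a given machine.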

Testing -
* Used qemu's `inject_nmi` for testing.
* Tested with and without AVIC.
* Tested with kvm-unit-tests.
* Tested with vGIF enabled and disabled.
* Tested in a nested environment:
- L1 and L2 both using VNMI
- L1 using VNMI and L2 not


Thanks,
Santosh
[1] https://www.amd.com/system/files/TechDocs/40332.pdf - APM Vol2,
ch-15.20 - "Event Injection".

Santosh Shukla (8):
x86/cpu: Add CPUID feature bit for VNMI
KVM: SVM: Add VNMI bit definition
KVM: SVM: Add VNMI support in get/set_nmi_mask
KVM: SVM: Report NMI not allowed when Guest busy handling VNMI
KVM: SVM: Add VNMI support in inject_nmi
KVM: nSVM: implement nested VNMI
KVM: nSVM: emulate VMEXIT_INVALID case for nested VNMI
KVM: SVM: Enable VNMI feature

arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 7 +++
arch/x86/kvm/svm/nested.c | 32 ++++++++++++++
arch/x86/kvm/svm/svm.c | 44 ++++++++++++++++++-
arch/x86/kvm/svm/svm.h | 68 ++++++++++++++++++++++++++++++
5 files changed, 151 insertions(+), 1 deletion(-)

--
2.25.1


2022-08-29 10:35:26

by Santosh Shukla

Subject: [PATCHv4 4/8] KVM: SVM: Report NMI not allowed when Guest busy handling VNMI

In the VNMI case, report that an NMI is not allowed when V_NMI_PENDING
is set, which means a virtual NMI is already pending for the guest to
process while the guest is busy handling the current virtual NMI. The
guest will first finish handling the current virtual NMI and will then
take the pending event without a VMEXIT.

Signed-off-by: Santosh Shukla <[email protected]>
---
v3:
- Added is_vnmi_pending_set API to check the vnmi pending state.
- Replaced is_vnmi_mask_set check with is_vnmi_pending_set.

v2:
- Moved vnmi check after is_guest_mode() in func _nmi_blocked().
- Removed is_vnmi_mask_set check from _enable_nmi_window()
as it was a redundant check.

arch/x86/kvm/svm/svm.c | 6 ++++++
arch/x86/kvm/svm/svm.h | 10 ++++++++++
2 files changed, 16 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ab5df74da626..810b93774a95 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3598,6 +3598,9 @@ bool svm_nmi_blocked(struct kvm_vcpu *vcpu)
 	if (is_guest_mode(vcpu) && nested_exit_on_nmi(svm))
 		return false;

+	if (is_vnmi_enabled(svm) && is_vnmi_pending_set(svm))
+		return true;
+
 	ret = (vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) ||
 	      (vcpu->arch.hflags & HF_NMI_MASK);

@@ -3734,6 +3737,9 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);

+	if (is_vnmi_enabled(svm) && is_vnmi_pending_set(svm))
+		return;
+
 	if ((vcpu->arch.hflags & (HF_NMI_MASK | HF_IRET_MASK)) == HF_NMI_MASK)
 		return; /* IRET will cause a vm exit */

diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index cc98ec7bd119..7857a89d0ec8 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -584,6 +584,16 @@ static inline void clear_vnmi_mask(struct vcpu_svm *svm)
svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
}

+static inline bool is_vnmi_pending_set(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vnmi_vmcb(svm);
+
+	if (vmcb)
+		return !!(vmcb->control.int_ctl & V_NMI_PENDING);
+	else
+		return false;
+}
+
/* svm.c */
#define MSR_INVALID 0xffffffffU

--
2.25.1

2022-08-29 10:37:34

by Santosh Shukla

Subject: [PATCHv4 7/8] KVM: nSVM: emulate VMEXIT_INVALID case for nested VNMI

If NMI virtualization is enabled and NMI_INTERCEPT is unset, then the
next VM entry will exit with the #INVALID exit reason.

To emulate the above VMEXIT(#INVALID) scenario in a nested environment,
extend the checks in __nested_vmcb_check_controls() to cover the
V_NMI_ENABLE and NMI_INTERCEPT bits.

Signed-off-by: Santosh Shukla <[email protected]>
---
arch/x86/kvm/svm/nested.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 3d986ec83147..9d031fadcd67 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -296,6 +296,11 @@ static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
 	if (CC(!nested_svm_check_tlb_ctl(vcpu, control->tlb_ctl)))
 		return false;

+	if (CC((control->int_ctl & V_NMI_ENABLE) &&
+		!vmcb12_is_intercept(control, INTERCEPT_NMI))) {
+		return false;
+	}
+
 	return true;
 }

--
2.25.1

2022-08-29 10:39:14

by Santosh Shukla

Subject: [PATCHv4 2/8] KVM: SVM: Add VNMI bit definition

VNMI exposes 3 capability bits (V_NMI, V_NMI_MASK, and V_NMI_ENABLE) to
virtualize NMI and NMI_MASK. Those capability bits are part of the VMCB
interrupt control field (int_ctl) -
V_NMI(11) - Indicates whether a virtual NMI is pending in the guest.
V_NMI_MASK(12) - Indicates whether virtual NMI is masked in the guest.
V_NMI_ENABLE(26) - Enables the NMI virtualization feature for the guest.

When the hypervisor wants to inject an NMI, it sets the V_NMI bit. The
processor then clears the V_NMI bit and sets V_NMI_MASK, which means the
guest is handling the NMI. After the guest has handled the NMI, the
processor clears V_NMI_MASK on successful completion of the IRET
instruction, or if a VMEXIT occurs while delivering the virtual NMI.

To enable the VNMI capability, the hypervisor needs to set the
V_NMI_ENABLE bit to 1.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Santosh Shukla <[email protected]>
---
arch/x86/include/asm/svm.h | 7 +++++++
arch/x86/kvm/svm/svm.c | 6 ++++++
2 files changed, 13 insertions(+)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 0361626841bc..73bf97e04fe3 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -198,6 +198,13 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define X2APIC_MODE_SHIFT 30
#define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT)

+#define V_NMI_PENDING_SHIFT 11
+#define V_NMI_PENDING (1 << V_NMI_PENDING_SHIFT)
+#define V_NMI_MASK_SHIFT 12
+#define V_NMI_MASK (1 << V_NMI_MASK_SHIFT)
+#define V_NMI_ENABLE_SHIFT 26
+#define V_NMI_ENABLE (1 << V_NMI_ENABLE_SHIFT)
+
#define LBR_CTL_ENABLE_MASK BIT_ULL(0)
#define VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK BIT_ULL(1)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f3813dbacb9f..38db96121c32 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -229,6 +229,8 @@ module_param(dump_invalid_vmcb, bool, 0644);
bool intercept_smi = true;
module_param(intercept_smi, bool, 0444);

+bool vnmi = true;
+module_param(vnmi, bool, 0444);

static bool svm_gp_erratum_intercept = true;

@@ -5063,6 +5065,10 @@ static __init int svm_hardware_setup(void)
 		svm_x86_ops.vcpu_get_apicv_inhibit_reasons = NULL;
 	}

+	vnmi = vnmi && boot_cpu_has(X86_FEATURE_V_NMI);
+	if (vnmi)
+		pr_info("V_NMI enabled\n");
+
 	if (vls) {
 		if (!npt_enabled ||
 		    !boot_cpu_has(X86_FEATURE_V_VMSAVE_VMLOAD) ||
--
2.25.1

2022-08-29 11:00:41

by Santosh Shukla

Subject: [PATCHv4 8/8] KVM: SVM: Enable VNMI feature

Enable the NMI virtualization (V_NMI_ENABLE) in the VMCB interrupt
control when the vnmi module parameter is set.

Signed-off-by: Santosh Shukla <[email protected]>
---
arch/x86/kvm/svm/svm.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2e50a7ab32db..cb1ad6c6d377 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1309,6 +1309,9 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	if (kvm_vcpu_apicv_active(vcpu))
 		avic_init_vmcb(svm, vmcb);

+	if (vnmi)
+		svm->vmcb->control.int_ctl |= V_NMI_ENABLE;
+
 	if (vgif) {
 		svm_clr_intercept(svm, INTERCEPT_STGI);
 		svm_clr_intercept(svm, INTERCEPT_CLGI);
--
2.25.1

2022-10-06 18:45:03

by Sean Christopherson

Subject: Re: [PATCHv4 0/8] Virtual NMI feature

On Mon, Aug 29, 2022, Santosh Shukla wrote:
> If NMI virtualization enabled and NMI_INTERCEPT bit is unset
> then HW will exit with #INVALID exit reason.
>
> To enable the VNMI capability, Hypervisor need to program
> V_NMI_ENABLE bit 1.
>
> The presence of this feature is indicated via the CPUID function
> 0x8000000A_EDX[25].

Until there is publicly available documentation, I am not going to review this
any further. This goes for all new features, e.g. PerfMonv2[*]. I understand
the need and desire to get code merged far in advance of hardware being available,
but y'all clearly have specs, i.e. this is a very solvable problem. Throw all the
disclaimers you want on the specs to make it abundantly clear that they are for
preview purposes or whatever, but reviewing KVM code without a spec just doesn't
work for me.

[*] https://lore.kernel.org/all/[email protected]

2022-10-10 06:06:28

by Santosh Shukla

Subject: Re: [PATCHv4 0/8] Virtual NMI feature



On 10/7/2022 12:10 AM, Sean Christopherson wrote:
> On Mon, Aug 29, 2022, Santosh Shukla wrote:
>> If NMI virtualization enabled and NMI_INTERCEPT bit is unset
>> then HW will exit with #INVALID exit reason.
>>
>> To enable the VNMI capability, Hypervisor need to program
>> V_NMI_ENABLE bit 1.
>>
>> The presence of this feature is indicated via the CPUID function
>> 0x8000000A_EDX[25].
>
> Until there is publicly available documentation, I am not going to review this
> any further. This goes for all new features, e.g. PerfMonv2[*]. I understand
> the need and desire to get code merged far in advance of hardware being available,
> but y'all clearly have specs, i.e. this is a very solvable problem. Throw all the
> disclaimers you want on the specs to make it abundantly clear that they are for
> preview purposes or whatever, but reviewing KVM code without a spec just doesn't
> work for me.
>

Sure Sean.

I am told that the APM should be out in the next couple of weeks.

Thanks,
Santosh

> [*] https://lore.kernel.org/all/[email protected]

2022-10-10 16:05:25

by Sean Christopherson

Subject: Re: [PATCHv4 0/8] Virtual NMI feature

On Mon, Oct 10, 2022, Santosh Shukla wrote:
>
>
> On 10/7/2022 12:10 AM, Sean Christopherson wrote:
> > On Mon, Aug 29, 2022, Santosh Shukla wrote:
> >> If NMI virtualization enabled and NMI_INTERCEPT bit is unset
> >> then HW will exit with #INVALID exit reason.
> >>
> >> To enable the VNMI capability, Hypervisor need to program
> >> V_NMI_ENABLE bit 1.
> >>
> >> The presence of this feature is indicated via the CPUID function
> >> 0x8000000A_EDX[25].
> >
> > Until there is publicly available documentation, I am not going to review this
> > any further. This goes for all new features, e.g. PerfMonv2[*]. I understand
> > the need and desire to get code merged far in advance of hardware being available,
> > but y'all clearly have specs, i.e. this is a very solvable problem. Throw all the
> > disclaimers you want on the specs to make it abundantly clear that they are for
> > preview purposes or whatever, but reviewing KVM code without a spec just doesn't
> > work for me.
> >
>
> Sure Sean.
>
> I am told that the APM should be out in the next couple of weeks.

Probably too late to be of much value for virtual NMI support, but for future
features, it would be very helpful to release "preview" documentation ASAP so that
we don't have to wait for the next APM update, which IIUC only happens ~2 times a
year.

2022-10-16 06:14:31

by Santosh Shukla

Subject: Re: [PATCHv4 0/8] Virtual NMI feature



On 10/10/2022 9:21 PM, Sean Christopherson wrote:
> On Mon, Oct 10, 2022, Santosh Shukla wrote:
>>
>>
>> On 10/7/2022 12:10 AM, Sean Christopherson wrote:
>>> On Mon, Aug 29, 2022, Santosh Shukla wrote:
>>>> If NMI virtualization enabled and NMI_INTERCEPT bit is unset
>>>> then HW will exit with #INVALID exit reason.
>>>>
>>>> To enable the VNMI capability, Hypervisor need to program
>>>> V_NMI_ENABLE bit 1.
>>>>
>>>> The presence of this feature is indicated via the CPUID function
>>>> 0x8000000A_EDX[25].
>>>
>>> Until there is publicly available documentation, I am not going to review this
>>> any further. This goes for all new features, e.g. PerfMonv2[*]. I understand
>>> the need and desire to get code merged far in advance of hardware being available,
>>> but y'all clearly have specs, i.e. this is a very solvable problem. Throw all the
>>> disclaimers you want on the specs to make it abundantly clear that they are for
>>> preview purposes or whatever, but reviewing KVM code without a spec just doesn't
>>> work for me.
>>>
>>
>> Sure Sean.
>>
>> I am told that the APM should be out in the next couple of weeks.
>
> Probably too late to be of much value for virtual NMI support, but for future
> features, it would be very helpful to release "preview" documentation ASAP so that
> we don't have to wait for the next APM update, which IIUC only happens ~2 times a
> year.

The Virtual NMI spec is at [1], Section 15.21.10 "NMI Virtualization".

Thanks,
Santosh
[1] https://www.amd.com/en/support/tech-docs/amd64-architecture-programmers-manual-volumes-1-5