2019-08-15 16:42:28

by Suthikulpanit, Suravee

Subject: [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode

Commit 67034bb9dd5e ("KVM: SVM: Add irqchip_split() checks before
enabling AVIC") was introduced to fix miscellaneous boot-hang issues
when AVIC is enabled. This is mainly because the AVIC hardware does not
#vmexit on writes to the LAPIC EOI register, which leaves the in-kernel
PIC and IOAPIC waiting forever and unable to inject new interrupts
(e.g. from the PIT).

This limits AVIC to kernel_irqchip=split mode, which is not currently
enabled by default, and also requires user space to support the split
irqchip model, which might not be the case.

The goal of this series is to enable AVIC to work in both irqchip modes,
by allowing AVIC to be deactivated temporarily at runtime, falling back
to legacy interrupt injection mode (w/ vINTR and interrupt windows)
when needed, and re-enabling AVIC subsequently.

A similar approach is also used to handle Hyper-V SynIC in
commit 5c919412fe61 ("kvm/x86: Hyper-V synthetic interrupt controller"),
where APICv is permanently disabled at runtime (currently broken for
AVIC, and fixed by this series).

In addition, when KVM currently disables APICv (e.g. when running with
kernel_irqchip=on mode on AMD, or due to Hyper-V SynIC mode), this happens
under the hood and often confuses users. This is addressed in part 1
of this series.

This series contains three parts:
* Part 1: patches 1-4
  Introduce an APICv state enum and logic for keeping track of the state
  for each VM, along with a debugfs entry for checking whether APICv is
  enabled for a particular VM.

* Part 2: patches 5-11
  Add support for activating/deactivating APICv at runtime (see the
  sketch after this list).

* Part 3: patches 12-15
  Provide a workaround for AVIC EOI and allow enabling AVIC w/
  kernel_irqchip=on.
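
To make the runtime toggle in part 2 concrete, the flow is roughly as
sketched below. This is only an illustration: the caller shown is
hypothetical, while the request interface and the per-vcpu handling come
from the later patches in this series.

    /* Some vcpu decides the VM must fall back to legacy injection ... */
    kvm_make_apicv_deactivate_request(vcpu, false);  /* false == temporary */

    /* ... every vcpu then notices the request in vcpu_enter_guest() ... */
    if (kvm_check_request(KVM_REQ_APICV_DEACTIVATE, vcpu))
            kvm_vcpu_deactivate_apicv(vcpu);

    /* ... and the reverse path re-activates APICv VM-wide later on. */
    kvm_make_apicv_activate_request(vcpu);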

Pre-requisite Patch:
* commit b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC
(de-)activation code")
(https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/
?h=next&id=b9c6ff94e43a0ee053e0c1d983fba1ac4953b762)

This series has been tested against v5.2 as follows:
* Booting Linux, FreeBSD, and Windows Server 2019 VMs with up to 255 vCPUs
  using the QEMU options "kernel-irqchip=on" and "-no-hpet".
* Passing through an Intel 10GbE NIC and running netperf in the VM.

Changes from V1: (https://lkml.org/lkml/2019/3/22/1042)
* Introduce APICv state enumeration
* Introduce KVM debugfs for APICv state
* Add synchronization logic for APICv state to prevent potential
race condition (per Jan's suggestion)
* Add support for activate/deactivate posted interrupt
(per Jan's suggestion)
* Remove callback functions for handling APIC ID, DFR and LDR update
(per Paolo's suggestion)
* Add workaround for handling EOI for in-kernel PIT and IOAPIC.

Suravee Suthikulpanit (15):
kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm
parameter
kvm: x86: Introduce KVM APICv state
kvm: Add arch-specific per-VM debugfs support
kvm: x86: Add per-VM APICv state debugfs
kvm: lapic: Introduce APICv update helper function
kvm: x86: Add support for activate/de-activate APICv at runtime
kvm: x86: svm: Add support to activate/deactivate posted interrupts
svm: Add support for setup/destroy virtual APIC backing page for AVIC
svm: Add support for activate/deactivate AVIC at runtime
kvm: x86: hyperv: Use APICv deactivate request interface
svm: Temporary deactivate AVIC during ExtINT handling
kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC
kvm: lapic: Clean up APIC predefined macros
kvm: ioapic: Delay update IOAPIC EOI for RTC
svm: Allow AVIC with in-kernel irqchip mode

arch/mips/kvm/mips.c | 5 ++
arch/powerpc/kvm/powerpc.c | 5 ++
arch/s390/kvm/kvm-s390.c | 5 ++
arch/x86/include/asm/kvm_host.h | 26 +++++-
arch/x86/kvm/debugfs.c | 27 +++++++
arch/x86/kvm/hyperv.c | 12 ++-
arch/x86/kvm/i8254.c | 31 +++++--
arch/x86/kvm/ioapic.c | 36 ++++++++-
arch/x86/kvm/lapic.c | 35 +++++---
arch/x86/kvm/lapic.h | 2 +
arch/x86/kvm/svm.c | 173 +++++++++++++++++++++++++++++++++++++---
arch/x86/kvm/vmx/vmx.c | 2 +-
arch/x86/kvm/x86.c | 96 +++++++++++++++++++++-
include/linux/kvm_host.h | 1 +
virt/kvm/arm/arm.c | 5 ++
virt/kvm/kvm_main.c | 2 +-
16 files changed, 421 insertions(+), 42 deletions(-)

--
1.8.3.1


2019-08-15 16:43:00

by Suthikulpanit, Suravee

Subject: [PATCH v2 03/15] kvm: Add arch-specific per-VM debugfs support

Introduce an arch-specific kvm_arch_create_vm_debugfs() hook so that each
architecture can provide its own per-VM debugfs entries.
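
For illustration, an architecture could hook a per-VM debugfs file into
this roughly as in the sketch below. The file name and attribute helper
are made up for the example; debugfs_create_file() and
kvm->debugfs_dentry are the same interfaces already used by
kvm_create_vm_debugfs().

    static int vm_get_online_vcpus(void *data, u64 *val)
    {
            struct kvm *kvm = data;

            *val = atomic_read(&kvm->online_vcpus);
            return 0;
    }
    DEFINE_SIMPLE_ATTRIBUTE(vm_online_vcpus_fops, vm_get_online_vcpus,
                            NULL, "%llu\n");

    int kvm_arch_create_vm_debugfs(struct kvm *kvm)
    {
            /* Expose an example read-only per-VM counter. */
            debugfs_create_file("online-vcpus", 0444, kvm->debugfs_dentry,
                                kvm, &vm_online_vcpus_fops);
            return 0;
    }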

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/mips/kvm/mips.c | 5 +++++
arch/powerpc/kvm/powerpc.c | 5 +++++
arch/s390/kvm/kvm-s390.c | 5 +++++
arch/x86/kvm/debugfs.c | 5 +++++
include/linux/kvm_host.h | 1 +
virt/kvm/arm/arm.c | 5 +++++
virt/kvm/kvm_main.c | 2 +-
7 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 0369f26..d8325b7 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -160,6 +160,11 @@ int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
return 0;
}

+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+ return 0;
+}
+
void kvm_mips_free_vcpus(struct kvm *kvm)
{
unsigned int i;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 6d704ad..44766af 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -462,6 +462,11 @@ int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
return 0;
}

+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+ return 0;
+}
+
void kvm_arch_destroy_vm(struct kvm *kvm)
{
unsigned int i;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 28ebd64..243b5c4 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2518,6 +2518,11 @@ int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
return 0;
}

+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+ return 0;
+}
+
void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
{
VCPU_EVENT(vcpu, 3, "%s", "free cpu");
diff --git a/arch/x86/kvm/debugfs.c b/arch/x86/kvm/debugfs.c
index 329361b..d62852c 100644
--- a/arch/x86/kvm/debugfs.c
+++ b/arch/x86/kvm/debugfs.c
@@ -48,6 +48,11 @@ static int vcpu_get_tsc_scaling_frac_bits(void *data, u64 *val)

DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_scaling_frac_fops, vcpu_get_tsc_scaling_frac_bits, NULL, "%llu\n");

+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+ return 0;
+}
+
int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
{
struct dentry *ret;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d1ad38a..ef9f176 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -862,6 +862,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,

bool kvm_arch_has_vcpu_debugfs(void);
int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu);
+int kvm_arch_create_vm_debugfs(struct kvm *kvm);

int kvm_arch_hardware_enable(void);
void kvm_arch_hardware_disable(void);
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index bd5c559..8812c55 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -154,6 +154,11 @@ int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
return 0;
}

+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+ return 0;
+}
+
vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
{
return VM_FAULT_SIGBUS;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2f2d24a..e39db0c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -620,7 +620,7 @@ static int kvm_create_vm_debugfs(struct kvm *kvm, int fd)
debugfs_create_file(p->name, 0644, kvm->debugfs_dentry,
stat_data, stat_fops_per_vm[p->kind]);
}
- return 0;
+ return kvm_arch_create_vm_debugfs(kvm);
}

static struct kvm *kvm_create_vm(unsigned long type)
--
1.8.3.1

2019-08-15 17:17:51

by Suthikulpanit, Suravee

Subject: [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime

Certain runtime conditions require APICv to be temporarily deactivated.
However, the current implementation only supports permanently deactivating
APICv at runtime (mainly used when running Hyper-V guests).

In addition, for AMD, when activating/deactivating APICv at runtime,
all vcpus in the VM have to operate in the same APICv mode, which
requires the requesting (main) vcpu to notify the others.

So, introduce interfaces to request that all vcpus activate/deactivate
APICv.
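
As a usage illustration (the caller below is hypothetical, not part of
this patch), a vendor backend that needs to toggle APICv for the whole
VM would do something like:

    static void example_toggle_apicv(struct kvm_vcpu *vcpu, bool activate)
    {
            /*
             * Both helpers serialize on kvm->arch.apicv_lock, clear the
             * opposite pending request on every vcpu, update apicv_state
             * and broadcast the matching KVM_REQ_APICV_* request, which
             * each vcpu then handles in vcpu_enter_guest().
             */
            if (activate)
                    kvm_make_apicv_activate_request(vcpu);
            else
                    /* false == temporary (APICV_DEACTIVATED, not DISABLED) */
                    kvm_make_apicv_deactivate_request(vcpu, false);
    }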

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 9 +++++
arch/x86/kvm/x86.c | 76 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 85 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 04d7066..dfb7c3d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -76,6 +76,10 @@
#define KVM_REQ_HV_STIMER KVM_ARCH_REQ(22)
#define KVM_REQ_LOAD_EOI_EXITMAP KVM_ARCH_REQ(23)
#define KVM_REQ_GET_VMCS12_PAGES KVM_ARCH_REQ(24)
+#define KVM_REQ_APICV_ACTIVATE \
+ KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_APICV_DEACTIVATE \
+ KVM_ARCH_REQ_FLAGS(26, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)

#define CR0_RESERVED_BITS \
(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -1089,6 +1093,7 @@ struct kvm_x86_ops {
void (*enable_irq_window)(struct kvm_vcpu *vcpu);
void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
bool (*get_enable_apicv)(struct kvm *kvm);
+ void (*pre_update_apicv_exec_ctrl)(struct kvm_vcpu *vcpu, bool activate);
void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
@@ -1552,6 +1557,10 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,

void kvm_make_mclock_inprogress_request(struct kvm *kvm);
void kvm_make_scan_ioapic_request(struct kvm *kvm);
+void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
+void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu);
+void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu);
+void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable);

void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
struct kvm_async_pf *work);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f9c3f63..40a20bf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -26,6 +26,7 @@
#include "cpuid.h"
#include "pmu.h"
#include "hyperv.h"
+#include "lapic.h"

#include <linux/clocksource.h>
#include <linux/interrupt.h>
@@ -7163,6 +7164,22 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
}

+void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu)
+{
+ if (!lapic_in_kernel(vcpu)) {
+ WARN_ON_ONCE(!vcpu->arch.apicv_active);
+ return;
+ }
+ if (vcpu->arch.apicv_active)
+ return;
+
+ vcpu->arch.apicv_active = true;
+ kvm_apic_update_apicv(vcpu);
+
+ kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_activate_apicv);
+
void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
{
if (!lapic_in_kernel(vcpu)) {
@@ -7173,8 +7190,11 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
return;

vcpu->arch.apicv_active = false;
+ kvm_apic_update_apicv(vcpu);
+
kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
}
+EXPORT_SYMBOL_GPL(kvm_vcpu_deactivate_apicv);

int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
{
@@ -7668,6 +7688,58 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
}

+void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu)
+{
+ int i;
+ struct kvm_vcpu *v;
+ struct kvm *kvm = vcpu->kvm;
+
+ mutex_lock(&kvm->arch.apicv_lock);
+ if (kvm->arch.apicv_state != APICV_DEACTIVATED) {
+ mutex_unlock(&kvm->arch.apicv_lock);
+ return;
+ }
+
+ kvm_for_each_vcpu(i, v, kvm)
+ kvm_clear_request(KVM_REQ_APICV_DEACTIVATE, v);
+
+ if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
+ kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, true);
+
+ kvm->arch.apicv_state = APICV_ACTIVATED;
+
+ kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_ACTIVATE);
+
+ mutex_unlock(&kvm->arch.apicv_lock);
+}
+EXPORT_SYMBOL_GPL(kvm_make_apicv_activate_request);
+
+void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable)
+{
+ int i;
+ struct kvm_vcpu *v;
+ struct kvm *kvm = vcpu->kvm;
+
+ mutex_lock(&kvm->arch.apicv_lock);
+ if (kvm->arch.apicv_state == APICV_DEACTIVATED) {
+ mutex_unlock(&kvm->arch.apicv_lock);
+ return;
+ }
+
+ kvm_for_each_vcpu(i, v, kvm)
+ kvm_clear_request(KVM_REQ_APICV_ACTIVATE, v);
+
+ if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
+ kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, false);
+
+ kvm->arch.apicv_state = disable ? APICV_DISABLED : APICV_DEACTIVATED;
+
+ kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_DEACTIVATE);
+
+ mutex_unlock(&kvm->arch.apicv_lock);
+}
+EXPORT_SYMBOL_GPL(kvm_make_apicv_deactivate_request);
+
static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
{
if (!kvm_apic_present(vcpu))
@@ -7854,6 +7926,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
*/
if (kvm_check_request(KVM_REQ_HV_STIMER, vcpu))
kvm_hv_process_stimers(vcpu);
+ if (kvm_check_request(KVM_REQ_APICV_ACTIVATE, vcpu))
+ kvm_vcpu_activate_apicv(vcpu);
+ if (kvm_check_request(KVM_REQ_APICV_DEACTIVATE, vcpu))
+ kvm_vcpu_deactivate_apicv(vcpu);
}

if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
--
1.8.3.1

2019-08-15 17:19:05

by Suthikulpanit, Suravee

Subject: [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state

Currently, after a VM boots with APICv enabled, it could go into
the following states:
* activated = VM is running w/ APICv
* deactivated = VM deactivate APICv (temporary)
* disabled = VM deactivate APICv (permanent)

Introduce KVM APICv state enum to help keep track of the APICv states
along with a new variable struct kvm_arch.apicv_state to store
the current state.
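
For reference, readers of the new state are expected to take the
accompanying apicv_lock, roughly as in the hypothetical helper below
(illustration only, not part of this patch):

    static bool example_apicv_activated(struct kvm *kvm)
    {
            bool ret;

            mutex_lock(&kvm->arch.apicv_lock);
            ret = (kvm->arch.apicv_state == APICV_ACTIVATED);
            mutex_unlock(&kvm->arch.apicv_lock);

            return ret;
    }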

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 11 +++++++++++
arch/x86/kvm/x86.c | 14 +++++++++++++-
2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 56bc702..04d7066 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -845,6 +845,15 @@ enum kvm_irqchip_mode {
KVM_IRQCHIP_SPLIT, /* created with KVM_CAP_SPLIT_IRQCHIP */
};

+/*
+ * KVM assumes all vcpus in a VM operate in the same mode.
+ */
+enum kvm_apicv_state {
+ APICV_DISABLED, /* Disabled (such as for Hyper-V case) */
+ APICV_DEACTIVATED, /* Deactivated tempoerary */
+ APICV_ACTIVATED, /* Default status when APICV is enabled */
+};
+
struct kvm_arch {
unsigned long n_used_mmu_pages;
unsigned long n_requested_mmu_pages;
@@ -873,6 +882,8 @@ struct kvm_arch {
struct kvm_apic_map *apic_map;

bool apic_access_page_done;
+ struct mutex apicv_lock;
+ enum kvm_apicv_state apicv_state;

gpa_t wall_clock;

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7daf0dd..f9c3f63 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4584,6 +4584,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
kvm->arch.irqchip_mode = KVM_IRQCHIP_SPLIT;
kvm->arch.nr_reserved_ioapic_pins = cap->args[0];
r = 0;
+ if (kvm_x86_ops->get_enable_apicv(kvm))
+ kvm->arch.apicv_state = APICV_ACTIVATED;
split_irqchip_unlock:
mutex_unlock(&kvm->lock);
break;
@@ -4701,6 +4703,8 @@ long kvm_arch_vm_ioctl(struct file *filp,
/* Write kvm->irq_routing before enabling irqchip_in_kernel. */
smp_wmb();
kvm->arch.irqchip_mode = KVM_IRQCHIP_KERNEL;
+ if (kvm_x86_ops->get_enable_apicv(kvm))
+ kvm->arch.apicv_state = APICV_ACTIVATED;
create_irqchip_unlock:
mutex_unlock(&kvm->lock);
break;
@@ -9150,13 +9154,18 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
goto fail_free_pio_data;

if (irqchip_in_kernel(vcpu->kvm)) {
- vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
if (r < 0)
goto fail_mmu_destroy;
} else
static_key_slow_inc(&kvm_no_apic_vcpu);

+ mutex_lock(&vcpu->kvm->arch.apicv_lock);
+ if (irqchip_in_kernel(vcpu->kvm) &&
+ vcpu->kvm->arch.apicv_state == APICV_ACTIVATED)
+ vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
+ mutex_unlock(&vcpu->kvm->arch.apicv_lock);
+
vcpu->arch.mce_banks = kzalloc(KVM_MAX_MCE_BANKS * sizeof(u64) * 4,
GFP_KERNEL_ACCOUNT);
if (!vcpu->arch.mce_banks) {
@@ -9255,6 +9264,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
kvm_page_track_init(kvm);
kvm_mmu_init_vm(kvm);

+ /* APICV initialization */
+ mutex_init(&kvm->arch.apicv_lock);
+
if (kvm_x86_ops->vm_init)
return kvm_x86_ops->vm_init(kvm);

--
1.8.3.1

2019-08-15 17:29:43

by Suthikulpanit, Suravee

Subject: [PATCH v2 10/15] kvm: x86: hyperv: Use APICv deactivate request interface

Since disabling APICv has to be done for all vcpus on AMD-based system,
adopt the newly introduced kvm_make_apicv_deactivate_request() intereface.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/hyperv.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index a39e38f..4f71a39 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -772,9 +772,17 @@ int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages)

/*
* Hyper-V SynIC auto EOI SINT's are
- * not compatible with APICV, so deactivate APICV
+ * not compatible with APICV, so request
+ * to deactivate APICV permanently.
+ *
+ * Since this requires updating
+ * APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
+ * also take srcu lock.
*/
- kvm_vcpu_deactivate_apicv(vcpu);
+ vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+ kvm_make_apicv_deactivate_request(vcpu, true);
+ srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
+
synic->active = true;
synic->dont_zero_synic_pages = dont_zero_synic_pages;
return 0;
--
1.8.3.1

2019-08-15 17:29:58

by Suthikulpanit, Suravee

Subject: [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling

AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
deactivated and fall back to using legacy interrupt injection via vINTR
and interrupt window.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm.c | 49 +++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index cfa4b13..4690351 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
static void svm_complete_interrupts(struct vcpu_svm *svm);
+static void svm_request_activate_avic(struct kvm_vcpu *vcpu);
static bool svm_get_enable_apicv(struct kvm *kvm);
static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);

@@ -4494,6 +4495,15 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
{
kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
svm_clear_vintr(svm);
+
+ /*
+ * For AVIC, the only reason to end up here is ExtINTs.
+ * In this case AVIC was temporarily disabled for
+ * requesting the IRQ window and we have to re-enable it.
+ */
+ if (svm_get_enable_apicv(svm->vcpu.kvm))
+ svm_request_activate_avic(&svm->vcpu);
+
svm->vmcb->control.int_ctl &= ~V_IRQ_MASK;
mark_dirty(svm->vmcb, VMCB_INTR);
++svm->vcpu.stat.irq_window_exits;
@@ -5181,7 +5191,33 @@ static void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
{
}

-/* Note: Currently only used by Hyper-V. */
+static bool is_avic_active(struct vcpu_svm *svm)
+{
+ return (svm_get_enable_apicv(svm->vcpu.kvm) &&
+ svm->vmcb->control.int_ctl & AVIC_ENABLE_MASK);
+}
+
+static void svm_request_activate_avic(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (!lapic_in_kernel(vcpu) || is_avic_active(svm))
+ return;
+
+ kvm_make_apicv_activate_request(vcpu);
+}
+
+static void svm_request_deactivate_avic(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (!lapic_in_kernel(vcpu) || !is_avic_active(svm))
+ return;
+
+ /* Request temporary deactivate apicv */
+ kvm_make_apicv_deactivate_request(vcpu, false);
+}
+
static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);

- if (kvm_vcpu_apicv_active(vcpu))
- return;
-
/*
* In case GIF=0 we can't rely on the CPU to tell us when GIF becomes
* 1, because that's a separate STGI/VMRUN intercept. The next time we
@@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
* window under the assumption that the hardware will set the GIF.
*/
if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
+ /*
+ * IRQ window is not needed when AVIC is enabled,
+ * unless we have pending ExtINT since it cannot be injected
+ * via AVIC. In such case, we need to temporarily disable AVIC,
+ * and fallback to injecting IRQ via V_IRQ.
+ */
+ if (kvm_vcpu_apicv_active(vcpu))
+ svm_request_deactivate_avic(&svm->vcpu);
svm_set_vintr(svm);
svm_inject_irq(svm, 0x0);
}
--
1.8.3.1

2019-08-19 09:50:42

by Alexander Graf

Subject: Re: [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> Currently, after a VM boots with APICv enabled, it could go into
> the following states:
> * activated = VM is running w/ APICv
> * deactivated = VM deactivate APICv (temporary)
> * disabled = VM deactivate APICv (permanent)
>
> Introduce KVM APICv state enum to help keep track of the APICv states
> along with a new variable struct kvm_arch.apicv_state to store
> the current state.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 11 +++++++++++
> arch/x86/kvm/x86.c | 14 +++++++++++++-
> 2 files changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 56bc702..04d7066 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -845,6 +845,15 @@ enum kvm_irqchip_mode {
> KVM_IRQCHIP_SPLIT, /* created with KVM_CAP_SPLIT_IRQCHIP */
> };
>
> +/*
> + * KVM assumes all vcpus in a VM operate in the same mode.
> + */
> +enum kvm_apicv_state {
> + APICV_DISABLED, /* Disabled (such as for Hyper-V case) */
> + APICV_DEACTIVATED, /* Deactivated tempoerary */

typo

I'm also not sure the name is 100% obvious. How about something like
"suspended" or "paused"?

> + APICV_ACTIVATED, /* Default status when APICV is enabled */
> +};
> +
> struct kvm_arch {
> unsigned long n_used_mmu_pages;
> unsigned long n_requested_mmu_pages;
> @@ -873,6 +882,8 @@ struct kvm_arch {
> struct kvm_apic_map *apic_map;
>
> bool apic_access_page_done;
> + struct mutex apicv_lock;
> + enum kvm_apicv_state apicv_state;
>
> gpa_t wall_clock;
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7daf0dd..f9c3f63 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4584,6 +4584,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> kvm->arch.irqchip_mode = KVM_IRQCHIP_SPLIT;
> kvm->arch.nr_reserved_ioapic_pins = cap->args[0];
> r = 0;
> + if (kvm_x86_ops->get_enable_apicv(kvm))
> + kvm->arch.apicv_state = APICV_ACTIVATED;
> split_irqchip_unlock:
> mutex_unlock(&kvm->lock);
> break;
> @@ -4701,6 +4703,8 @@ long kvm_arch_vm_ioctl(struct file *filp,
> /* Write kvm->irq_routing before enabling irqchip_in_kernel. */
> smp_wmb();
> kvm->arch.irqchip_mode = KVM_IRQCHIP_KERNEL;
> + if (kvm_x86_ops->get_enable_apicv(kvm))
> + kvm->arch.apicv_state = APICV_ACTIVATED;
> create_irqchip_unlock:
> mutex_unlock(&kvm->lock);
> break;
> @@ -9150,13 +9154,18 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
> goto fail_free_pio_data;
>
> if (irqchip_in_kernel(vcpu->kvm)) {
> - vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);

Why are you moving this into a locked section?

> r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
> if (r < 0)
> goto fail_mmu_destroy;
> } else
> static_key_slow_inc(&kvm_no_apic_vcpu);
>
> + mutex_lock(&vcpu->kvm->arch.apicv_lock);
> + if (irqchip_in_kernel(vcpu->kvm) &&
> + vcpu->kvm->arch.apicv_state == APICV_ACTIVATED)
> + vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
> + mutex_unlock(&vcpu->kvm->arch.apicv_lock);
> +
> vcpu->arch.mce_banks = kzalloc(KVM_MAX_MCE_BANKS * sizeof(u64) * 4,
> GFP_KERNEL_ACCOUNT);
> if (!vcpu->arch.mce_banks) {
> @@ -9255,6 +9264,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> kvm_page_track_init(kvm);
> kvm_mmu_init_vm(kvm);
>
> + /* APICV initialization */
> + mutex_init(&kvm->arch.apicv_lock);

In fact, the whole lock story is not part of the patch description :).


Alex

> +
> if (kvm_x86_ops->vm_init)
> return kvm_x86_ops->vm_init(kvm);
>
>

2019-08-19 10:32:49

by Alexander Graf

Subject: Re: [PATCH v2 10/15] kvm: x86: hyperv: Use APICv deactivate request interface



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> Since disabling APICv has to be done for all vcpus on AMD-based system,
> adopt the newly introduced kvm_make_apicv_deactivate_request() intereface.

typo

>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/hyperv.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index a39e38f..4f71a39 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -772,9 +772,17 @@ int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages)
>
> /*
> * Hyper-V SynIC auto EOI SINT's are
> - * not compatible with APICV, so deactivate APICV
> + * not compatible with APICV, so request

double space

> + * to deactivate APICV permanently.
> + *
> + * Since this requires updating
> + * APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
> + * also take srcu lock.
> */
> - kvm_vcpu_deactivate_apicv(vcpu);
> + vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
> + kvm_make_apicv_deactivate_request(vcpu, true);
> + srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);

Overall, I'm not terribly happy with the srcu locks. Can't we handle the
memslot changes outside of the lock region, inside the respective
request handlers somehow?


Alex

> +
> synic->active = true;
> synic->dont_zero_synic_pages = dont_zero_synic_pages;
> return 0;
>

2019-08-19 10:38:50

by Alexander Graf

Subject: Re: [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
> deactivated and fall back to using legacy interrupt injection via vINTR
> and interrupt window.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm.c | 49 +++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 45 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index cfa4b13..4690351 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
> static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
> static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
> static void svm_complete_interrupts(struct vcpu_svm *svm);
> +static void svm_request_activate_avic(struct kvm_vcpu *vcpu);
> static bool svm_get_enable_apicv(struct kvm *kvm);
> static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
>
> @@ -4494,6 +4495,15 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
> {
> kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
> svm_clear_vintr(svm);
> +
> + /*
> + * For AVIC, the only reason to end up here is ExtINTs.
> + * In this case AVIC was temporarily disabled for
> + * requesting the IRQ window and we have to re-enable it.
> + */
> + if (svm_get_enable_apicv(svm->vcpu.kvm))
> + svm_request_activate_avic(&svm->vcpu);

Would it make sense to add a trace point here and to the other call
sites, so that it becomes obvious in a trace when and why exactly avic
was active/inactive?

The trace point could add additional information on the why.

> +
> svm->vmcb->control.int_ctl &= ~V_IRQ_MASK;
> mark_dirty(svm->vmcb, VMCB_INTR);
> ++svm->vcpu.stat.irq_window_exits;
> @@ -5181,7 +5191,33 @@ static void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
> {
> }
>
> -/* Note: Currently only used by Hyper-V. */
> +static bool is_avic_active(struct vcpu_svm *svm)
> +{
> + return (svm_get_enable_apicv(svm->vcpu.kvm) &&
> + svm->vmcb->control.int_ctl & AVIC_ENABLE_MASK);
> +}
> +
> +static void svm_request_activate_avic(struct kvm_vcpu *vcpu)
> +{
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> + if (!lapic_in_kernel(vcpu) || is_avic_active(svm))
> + return;
> +
> + kvm_make_apicv_activate_request(vcpu);
> +}
> +
> +static void svm_request_deactivate_avic(struct kvm_vcpu *vcpu)
> +{
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> + if (!lapic_in_kernel(vcpu) || !is_avic_active(svm))
> + return;
> +
> + /* Request temporary deactivate apicv */
> + kvm_make_apicv_deactivate_request(vcpu, false);
> +}
> +
> static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> @@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
>
> - if (kvm_vcpu_apicv_active(vcpu))
> - return;
> -
> /*
> * In case GIF=0 we can't rely on the CPU to tell us when GIF becomes
> * 1, because that's a separate STGI/VMRUN intercept. The next time we
> @@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
> * window under the assumption that the hardware will set the GIF.
> */
> if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
> + /*
> + * IRQ window is not needed when AVIC is enabled,
> + * unless we have pending ExtINT since it cannot be injected
> + * via AVIC. In such case, we need to temporarily disable AVIC,
> + * and fallback to injecting IRQ via V_IRQ.
> + */
> + if (kvm_vcpu_apicv_active(vcpu))
> + svm_request_deactivate_avic(&svm->vcpu);

Did you test AVIC with nesting? Did you actually run across this issue
there?


Alex

> svm_set_vintr(svm);
> svm_inject_irq(svm, 0x0);
> }
>

2019-08-26 20:45:10

by Suthikulpanit, Suravee

Subject: Re: [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state

Alex,

On 8/19/2019 4:49 AM, Alexander Graf wrote:
>
>
> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>> Currently, after a VM boots with APICv enabled, it could go into
>> the following states:
>>    * activated   = VM is running w/ APICv
>>    * deactivated = VM deactivate APICv (temporary)
>>    * disabled    = VM deactivate APICv (permanent)
>>
>> Introduce KVM APICv state enum to help keep track of the APICv states
>> along with a new variable struct kvm_arch.apicv_state to store
>> the current state.
>>
>> Signed-off-by: Suravee Suthikulpanit <[email protected]>
>> ---
>>   arch/x86/include/asm/kvm_host.h | 11 +++++++++++
>>   arch/x86/kvm/x86.c              | 14 +++++++++++++-
>>   2 files changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 56bc702..04d7066 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -845,6 +845,15 @@ enum kvm_irqchip_mode {
>>       KVM_IRQCHIP_SPLIT,        /* created with KVM_CAP_SPLIT_IRQCHIP */
>>   };
>> +/*
>> + * KVM assumes all vcpus in a VM operate in the same mode.
>> + */
>> +enum kvm_apicv_state {
>> +    APICV_DISABLED,        /* Disabled (such as for Hyper-V case) */
>> +    APICV_DEACTIVATED,    /* Deactivated tempoerary */
>
> typo
>
> I'm also not sure the name is 100% obvious. How about something like "suspended" or "paused"?

Ok, I'll change it to APICV_SUSPENDED.

>> ...
>> @@ -9150,13 +9154,18 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>>           goto fail_free_pio_data;
>>       if (irqchip_in_kernel(vcpu->kvm)) {
>> -        vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
>
> Why are you moving this into a locked section?

Since we introduced apicv_state to track the VM's APICv state, which is accessed
by each vcpu's initialization code, we need to take the lock and check the state
before setting the per-vcpu apicv_active.

>
>>           r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
>>           if (r < 0)
>>               goto fail_mmu_destroy;
>>       } else
>>           static_key_slow_inc(&kvm_no_apic_vcpu);
>> +    mutex_lock(&vcpu->kvm->arch.apicv_lock);
>> +    if (irqchip_in_kernel(vcpu->kvm) &&
>> +        vcpu->kvm->arch.apicv_state == APICV_ACTIVATED)
>> +        vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
>> +    mutex_unlock(&vcpu->kvm->arch.apicv_lock);
>> +
>>       vcpu->arch.mce_banks = kzalloc(KVM_MAX_MCE_BANKS * sizeof(u64) * 4,
>>                          GFP_KERNEL_ACCOUNT);
>>       if (!vcpu->arch.mce_banks) {
>> @@ -9255,6 +9264,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>       kvm_page_track_init(kvm);
>>       kvm_mmu_init_vm(kvm);
>> +    /* APICV initialization */
>> +    mutex_init(&kvm->arch.apicv_lock);
>
> In fact, the whole lock story is not part of the patch description :).

Ok, I'll update the commit log to describe the lock.

Suravee

2019-08-27 07:31:17

by Vitaly Kuznetsov

Subject: Re: [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime

"Suthikulpanit, Suravee" <[email protected]> writes:

> Certain runtime conditions require APICv to be temporarily deactivated.
> However, the current implementation only supports permanently deactivating
> APICv at runtime (mainly used when running Hyper-V guests).
>
> In addition, for AMD, when activating/deactivating APICv at runtime,
> all vcpus in the VM have to operate in the same APICv mode, which
> requires the requesting (main) vcpu to notify the others.
>
> So, introduce interfaces to request that all vcpus activate/deactivate
> APICv.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 9 +++++
> arch/x86/kvm/x86.c | 76 +++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 85 insertions(+)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 04d7066..dfb7c3d 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -76,6 +76,10 @@
> #define KVM_REQ_HV_STIMER KVM_ARCH_REQ(22)
> #define KVM_REQ_LOAD_EOI_EXITMAP KVM_ARCH_REQ(23)
> #define KVM_REQ_GET_VMCS12_PAGES KVM_ARCH_REQ(24)
> +#define KVM_REQ_APICV_ACTIVATE \
> + KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> +#define KVM_REQ_APICV_DEACTIVATE \
> + KVM_ARCH_REQ_FLAGS(26, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>
> #define CR0_RESERVED_BITS \
> (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
> @@ -1089,6 +1093,7 @@ struct kvm_x86_ops {
> void (*enable_irq_window)(struct kvm_vcpu *vcpu);
> void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
> bool (*get_enable_apicv)(struct kvm *kvm);
> + void (*pre_update_apicv_exec_ctrl)(struct kvm_vcpu *vcpu, bool activate);
> void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
> void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
> void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
> @@ -1552,6 +1557,10 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
>
> void kvm_make_mclock_inprogress_request(struct kvm *kvm);
> void kvm_make_scan_ioapic_request(struct kvm *kvm);
> +void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu);
> +void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu);
> +void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable);
>
> void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
> struct kvm_async_pf *work);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f9c3f63..40a20bf 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -26,6 +26,7 @@
> #include "cpuid.h"
> #include "pmu.h"
> #include "hyperv.h"
> +#include "lapic.h"
>
> #include <linux/clocksource.h>
> #include <linux/interrupt.h>
> @@ -7163,6 +7164,22 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
> kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
> }
>
> +void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu)
> +{
> + if (!lapic_in_kernel(vcpu)) {
> + WARN_ON_ONCE(!vcpu->arch.apicv_active);
> + return;
> + }
> + if (vcpu->arch.apicv_active)
> + return;
> +
> + vcpu->arch.apicv_active = true;
> + kvm_apic_update_apicv(vcpu);
> +
> + kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
> +}
> +EXPORT_SYMBOL_GPL(kvm_vcpu_activate_apicv);
> +
> void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
> {
> if (!lapic_in_kernel(vcpu)) {
> @@ -7173,8 +7190,11 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
> return;
>
> vcpu->arch.apicv_active = false;
> + kvm_apic_update_apicv(vcpu);
> +
> kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
> }
> +EXPORT_SYMBOL_GPL(kvm_vcpu_deactivate_apicv);
>
> int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> {
> @@ -7668,6 +7688,58 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
> kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
> }
>
> +void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu)
> +{
> + int i;
> + struct kvm_vcpu *v;
> + struct kvm *kvm = vcpu->kvm;
> +
> + mutex_lock(&kvm->arch.apicv_lock);
> + if (kvm->arch.apicv_state != APICV_DEACTIVATED) {
> + mutex_unlock(&kvm->arch.apicv_lock);
> + return;
> + }
> +
> + kvm_for_each_vcpu(i, v, kvm)
> + kvm_clear_request(KVM_REQ_APICV_DEACTIVATE, v);
> +
> + if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
> + kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, true);
> +
> + kvm->arch.apicv_state = APICV_ACTIVATED;
> +
> + kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_ACTIVATE);
> +
> + mutex_unlock(&kvm->arch.apicv_lock);
> +}
> +EXPORT_SYMBOL_GPL(kvm_make_apicv_activate_request);
> +
> +void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable)
> +{
> + int i;
> + struct kvm_vcpu *v;
> + struct kvm *kvm = vcpu->kvm;
> +
> + mutex_lock(&kvm->arch.apicv_lock);
> + if (kvm->arch.apicv_state == APICV_DEACTIVATED) {
> + mutex_unlock(&kvm->arch.apicv_lock);
> + return;
> + }
> +
> + kvm_for_each_vcpu(i, v, kvm)
> + kvm_clear_request(KVM_REQ_APICV_ACTIVATE, v);

Could you please elaborate on when we need to eat the
KVM_REQ_APICV_ACTIVATE request here (and KVM_REQ_APICV_DEACTIVATE in
kvm_make_apicv_activate_request() respectively)? To me, this looks like
a possible source of hard-to-debug problems in the future.

> +
> + if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
> + kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, false);
> +
> + kvm->arch.apicv_state = disable ? APICV_DISABLED : APICV_DEACTIVATED;
> +
> + kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_DEACTIVATE);
> +
> + mutex_unlock(&kvm->arch.apicv_lock);
> +}
> +EXPORT_SYMBOL_GPL(kvm_make_apicv_deactivate_request);
> +
> static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
> {
> if (!kvm_apic_present(vcpu))
> @@ -7854,6 +7926,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> */
> if (kvm_check_request(KVM_REQ_HV_STIMER, vcpu))
> kvm_hv_process_stimers(vcpu);
> + if (kvm_check_request(KVM_REQ_APICV_ACTIVATE, vcpu))
> + kvm_vcpu_activate_apicv(vcpu);
> + if (kvm_check_request(KVM_REQ_APICV_DEACTIVATE, vcpu))
> + kvm_vcpu_deactivate_apicv(vcpu);
> }
>
> if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {

--
Vitaly

2019-08-27 08:11:18

by Alexander Graf

Subject: Re: [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime



On 27.08.19 09:29, Vitaly Kuznetsov wrote:
> "Suthikulpanit, Suravee" <[email protected]> writes:
>
>> Certain runtime conditions require APICv to be temporarily deactivated.
>> However, the current implementation only supports permanently deactivating
>> APICv at runtime (mainly used when running Hyper-V guests).
>>
>> In addition, for AMD, when activating/deactivating APICv at runtime,
>> all vcpus in the VM have to operate in the same APICv mode, which
>> requires the requesting (main) vcpu to notify the others.
>>
>> So, introduce interfaces to request that all vcpus activate/deactivate
>> APICv.
>>
>> Signed-off-by: Suravee Suthikulpanit <[email protected]>
>> ---
>> arch/x86/include/asm/kvm_host.h | 9 +++++
>> arch/x86/kvm/x86.c | 76 +++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 85 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 04d7066..dfb7c3d 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -76,6 +76,10 @@
>> #define KVM_REQ_HV_STIMER KVM_ARCH_REQ(22)
>> #define KVM_REQ_LOAD_EOI_EXITMAP KVM_ARCH_REQ(23)
>> #define KVM_REQ_GET_VMCS12_PAGES KVM_ARCH_REQ(24)
>> +#define KVM_REQ_APICV_ACTIVATE \
>> + KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>> +#define KVM_REQ_APICV_DEACTIVATE \
>> + KVM_ARCH_REQ_FLAGS(26, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>
>> #define CR0_RESERVED_BITS \
>> (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
>> @@ -1089,6 +1093,7 @@ struct kvm_x86_ops {
>> void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>> void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>> bool (*get_enable_apicv)(struct kvm *kvm);
>> + void (*pre_update_apicv_exec_ctrl)(struct kvm_vcpu *vcpu, bool activate);
>> void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
>> void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
>> void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
>> @@ -1552,6 +1557,10 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
>>
>> void kvm_make_mclock_inprogress_request(struct kvm *kvm);
>> void kvm_make_scan_ioapic_request(struct kvm *kvm);
>> +void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
>> +void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu);
>> +void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu);
>> +void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable);
>>
>> void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
>> struct kvm_async_pf *work);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index f9c3f63..40a20bf 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -26,6 +26,7 @@
>> #include "cpuid.h"
>> #include "pmu.h"
>> #include "hyperv.h"
>> +#include "lapic.h"
>>
>> #include <linux/clocksource.h>
>> #include <linux/interrupt.h>
>> @@ -7163,6 +7164,22 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
>> kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
>> }
>>
>> +void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu)
>> +{
>> + if (!lapic_in_kernel(vcpu)) {
>> + WARN_ON_ONCE(!vcpu->arch.apicv_active);
>> + return;
>> + }
>> + if (vcpu->arch.apicv_active)
>> + return;
>> +
>> + vcpu->arch.apicv_active = true;
>> + kvm_apic_update_apicv(vcpu);
>> +
>> + kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_vcpu_activate_apicv);
>> +
>> void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
>> {
>> if (!lapic_in_kernel(vcpu)) {
>> @@ -7173,8 +7190,11 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
>> return;
>>
>> vcpu->arch.apicv_active = false;
>> + kvm_apic_update_apicv(vcpu);
>> +
>> kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
>> }
>> +EXPORT_SYMBOL_GPL(kvm_vcpu_deactivate_apicv);
>>
>> int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>> {
>> @@ -7668,6 +7688,58 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>> kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>> }
>>
>> +void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu)
>> +{
>> + int i;
>> + struct kvm_vcpu *v;
>> + struct kvm *kvm = vcpu->kvm;
>> +
>> + mutex_lock(&kvm->arch.apicv_lock);
>> + if (kvm->arch.apicv_state != APICV_DEACTIVATED) {
>> + mutex_unlock(&kvm->arch.apicv_lock);
>> + return;
>> + }
>> +
>> + kvm_for_each_vcpu(i, v, kvm)
>> + kvm_clear_request(KVM_REQ_APICV_DEACTIVATE, v);
>> +
>> + if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
>> + kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, true);
>> +
>> + kvm->arch.apicv_state = APICV_ACTIVATED;
>> +
>> + kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_ACTIVATE);
>> +
>> + mutex_unlock(&kvm->arch.apicv_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_make_apicv_activate_request);
>> +
>> +void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable)
>> +{
>> + int i;
>> + struct kvm_vcpu *v;
>> + struct kvm *kvm = vcpu->kvm;
>> +
>> + mutex_lock(&kvm->arch.apicv_lock);
>> + if (kvm->arch.apicv_state == APICV_DEACTIVATED) {
>> + mutex_unlock(&kvm->arch.apicv_lock);
>> + return;
>> + }
>> +
>> + kvm_for_each_vcpu(i, v, kvm)
>> + kvm_clear_request(KVM_REQ_APICV_ACTIVATE, v);
>
> Could you please elaborate on when we need to eat the
> KVM_REQ_APICV_ACTIVATE request here (and KVM_REQ_APICV_DEACTIVATE in
> kvm_make_apicv_activate_request() respectively)? To me, this looks like
> a possible source of hard-to-debug problems in the future.

CPU0                                        CPU1

kvm_make_apicv_activate_request()           ...
Handle KVM_REQ_APICV_ACTIVATE               KVM_REQ_APICV_ACTIVATE==1
vcpu_run()                                  ...
kvm_make_apicv_deactivate_request()         ...

In that case CPU1 would have both activate and deactivate requests
pending. You can see the same logic above in the activate() function,
just reverse.

I agree that it probably needs at least a comment to explain why we have
it. But when I looked at the code, I could not think of a nicer way to
implement it either.


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


2019-08-28 15:19:15

by Suthikulpanit, Suravee

Subject: Re: [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling

Alex,

On 8/19/19 5:35 AM, Alexander Graf wrote:
>
>
> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>> AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
>> deactivated and fall back to using legacy interrupt injection via vINTR
>> and interrupt window.
>>
>> Signed-off-by: Suravee Suthikulpanit <[email protected]>
>> ---
>>   arch/x86/kvm/svm.c | 49
>> +++++++++++++++++++++++++++++++++++++++++++++----
>>   1 file changed, 45 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index cfa4b13..4690351 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
>>   static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>>   static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
>>   static void svm_complete_interrupts(struct vcpu_svm *svm);
>> +static void svm_request_activate_avic(struct kvm_vcpu *vcpu);
>>   static bool svm_get_enable_apicv(struct kvm *kvm);
>>   static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
>> @@ -4494,6 +4495,15 @@ static int interrupt_window_interception(struct
>> vcpu_svm *svm)
>>   {
>>       kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
>>       svm_clear_vintr(svm);
>> +
>> +    /*
>> +     * For AVIC, the only reason to end up here is ExtINTs.
>> +     * In this case AVIC was temporarily disabled for
>> +     * requesting the IRQ window and we have to re-enable it.
>> +     */
>> +    if (svm_get_enable_apicv(svm->vcpu.kvm))
>> +        svm_request_activate_avic(&svm->vcpu);
>
> Would it make sense to add a trace point here and to the other call
> sites, so that it becomes obvious in a trace when and why exactly avic
> was active/inactive?
>
> The trace point could add additional information on the why.

Sure, sounds good.

>> ....
>> @@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu
>> *vcpu)
>>   {
>>       struct vcpu_svm *svm = to_svm(vcpu);
>> -    if (kvm_vcpu_apicv_active(vcpu))
>> -        return;
>> -
>>       /*
>>        * In case GIF=0 we can't rely on the CPU to tell us when GIF
>> becomes
>>        * 1, because that's a separate STGI/VMRUN intercept.  The next
>> time we
>> @@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu
>> *vcpu)
>>        * window under the assumption that the hardware will set the GIF.
>>        */
>>       if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
>> +        /*
>> +         * IRQ window is not needed when AVIC is enabled,
>> +         * unless we have pending ExtINT since it cannot be injected
>> +         * via AVIC. In such case, we need to temporarily disable AVIC,
>> +         * and fallback to injecting IRQ via V_IRQ.
>> +         */
>> +        if (kvm_vcpu_apicv_active(vcpu))
>> +            svm_request_deactivate_avic(&svm->vcpu);
>
> Did you test AVIC with nesting? Did you actually run across this issue
> there?

Currently, we have not claimed that AVIC is supported w/ nested VM.
That's why we have not enabled AVIC by default yet. We will be looking
more into that next.

Suravee

2019-08-28 19:39:37

by Alexander Graf

Subject: Re: [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling



> On 28.08.2019 at 17:19, Suthikulpanit, Suravee <[email protected]> wrote:
>
> Alex,
>
>> On 8/19/19 5:35 AM, Alexander Graf wrote:
>>
>>
>>> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>>> AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
>>> deactivated and fall back to using legacy interrupt injection via vINTR
>>> and interrupt window.
>>>
>>> Signed-off-by: Suravee Suthikulpanit <[email protected]>
>>> ---
>>> arch/x86/kvm/svm.c | 49
>>> +++++++++++++++++++++++++++++++++++++++++++++----
>>> 1 file changed, 45 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>>> index cfa4b13..4690351 100644
>>> --- a/arch/x86/kvm/svm.c
>>> +++ b/arch/x86/kvm/svm.c
>>> @@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
>>> static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>>> static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
>>> static void svm_complete_interrupts(struct vcpu_svm *svm);
>>> +static void svm_request_activate_avic(struct kvm_vcpu *vcpu);
>>> static bool svm_get_enable_apicv(struct kvm *kvm);
>>> static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
>>> @@ -4494,6 +4495,15 @@ static int interrupt_window_interception(struct
>>> vcpu_svm *svm)
>>> {
>>> kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
>>> svm_clear_vintr(svm);
>>> +
>>> + /*
>>> + * For AVIC, the only reason to end up here is ExtINTs.
>>> + * In this case AVIC was temporarily disabled for
>>> + * requesting the IRQ window and we have to re-enable it.
>>> + */
>>> + if (svm_get_enable_apicv(svm->vcpu.kvm))
>>> + svm_request_activate_avic(&svm->vcpu);
>>
>> Would it make sense to add a trace point here and to the other call
>> sites, so that it becomes obvious in a trace when and why exactly avic
>> was active/inactive?
>>
>> The trace point could add additional information on the why.
>
> Sure, sounds good.
>
>>> ....
>>> @@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu
>>> *vcpu)
>>> {
>>> struct vcpu_svm *svm = to_svm(vcpu);
>>> - if (kvm_vcpu_apicv_active(vcpu))
>>> - return;
>>> -
>>> /*
>>> * In case GIF=0 we can't rely on the CPU to tell us when GIF
>>> becomes
>>> * 1, because that's a separate STGI/VMRUN intercept. The next
>>> time we
>>> @@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu
>>> *vcpu)
>>> * window under the assumption that the hardware will set the GIF.
>>> */
>>> if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
>>> + /*
>>> + * IRQ window is not needed when AVIC is enabled,
>>> + * unless we have pending ExtINT since it cannot be injected
>>> + * via AVIC. In such case, we need to temporarily disable AVIC,
>>> + * and fallback to injecting IRQ via V_IRQ.
>>> + */
>>> + if (kvm_vcpu_apicv_active(vcpu))
>>> + svm_request_deactivate_avic(&svm->vcpu);
>>
>> Did you test AVIC with nesting? Did you actually run across this issue
>> there?
>
> Currently, we have not claimed that AVIC is supported w/ nested VM.
> That's why we have not enabled AVIC by default yet. We will be looking
> more into that next.

If it's not supported, please suspend it when we enter a nested guest then? In that case, the above change is also unnecessary, as it only affects nested guests, no?

Alex

>
> Suravee



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



2019-09-10 19:11:48

by Suthikulpanit, Suravee

Subject: Re: [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling

Alex,

On 8/28/19 2:37 PM, Graf (AWS), Alexander wrote:
>>>> @@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu
>>>> *vcpu)
>>>> {
>>>> struct vcpu_svm *svm = to_svm(vcpu);
>>>> - if (kvm_vcpu_apicv_active(vcpu))
>>>> - return;
>>>> -
>>>> /*
>>>> * In case GIF=0 we can't rely on the CPU to tell us when GIF
>>>> becomes
>>>> * 1, because that's a separate STGI/VMRUN intercept. The next
>>>> time we
>>>> @@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu
>>>> *vcpu)
>>>> * window under the assumption that the hardware will set the GIF.
>>>> */
>>>> if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
>>>> + /*
>>>> + * IRQ window is not needed when AVIC is enabled,
>>>> + * unless we have pending ExtINT since it cannot be injected
>>>> + * via AVIC. In such case, we need to temporarily disable AVIC,
>>>> + * and fallback to injecting IRQ via V_IRQ.
>>>> + */
>>>> + if (kvm_vcpu_apicv_active(vcpu))
>>>> + svm_request_deactivate_avic(&svm->vcpu);
>>> Did you test AVIC with nesting? Did you actually run across this issue
>>> there?
>> Currently, we have not claimed that AVIC is supported w/ nested VM.
>> That's why we have not enabled AVIC by default yet. We will be looking
>> more into that next.
> If it's not supported, please suspend it when we enter a nested guest then?

Ok, this makes sense. I'll update this in V3.

> In that case, the above change is also unnecessary, as it only affects nested guests, no?

Actually, the function name nested_svm_intr() is misleading. Here it
returns true when _NOT_ in guest mode:

/* This function returns true if it is save to enable the irq window */
static inline bool nested_svm_intr(struct vcpu_svm *svm)
{
if (!is_guest_mode(&svm->vcpu))
return true;
....

So, the logic above does what we want when AVIC is enabled.

Thanks,
Suravee