Currently, AMD AVIC can only be enabled when creating VM in irqchip
split mode, which is due to the following issues:
* AMD AVIC does not support ExtINT, which is required during booting
phase of Windows and FreeBSD VMs. This results in hanging in the
boot loaders.
* Untrap APIC EOI write for edge-trigger interrpt.
Please see 67034bb9dd5e ('KVM: SVM: Add irqchip_split() checks
before enabling AVIC') for more information.
This restriction makes AVIC difficult to enable since it depends on
the non-default QEMU's kernel_irqchip=split option. Therefore, this RFC
series tries to provide workaround for the issues above, and allow AVIC to
work with the option kernel_irqchip=on (QEMU default).
Patch 1 fixes AVIC deactivation.
Patch 2-6 introduces run-time AVIC acticate/deactivate support.
Patch 7 provide workaround for untrap APIC EOI.
Patch 8 removes the kernel irqchip split restriction.
This series has been tested booting Linux, Windows, and FreeBSD VMs
with kernel_irqchip=off/split/on.
Thanks,
Suravee
Julian Stecklina (1):
KVM: i8254: Remove need for irq ack notifier
Suravee Suthikulpanit (7):
svm: Fix improper check when deactivate AVIC
KVM: x86: Add interface for run-time activate/de-activate APIC
virtualization
KVM: x86: Add callback functions for handling APIC ID, DFR and LDR
update
svm: Add AMD AVIC handlers for APIC ID, DFR and LDR update
svm: Add support for APIC_ACCESS_PAGE_PRIVATE_MEMSLOT setup/destroy
svm: Temporary deactivate AVIC during ExtINT handling
svm: Allow AVIC with in-kernel irqchip mode
arch/x86/include/asm/kvm_host.h | 11 +++
arch/x86/kvm/i8254.c | 22 +-----
arch/x86/kvm/i8254.h | 2 -
arch/x86/kvm/lapic.c | 11 ++-
arch/x86/kvm/svm.c | 118 +++++++++++++++++++++++++++++---
arch/x86/kvm/x86.c | 36 ++++++++++
6 files changed, 164 insertions(+), 36 deletions(-)
--
2.17.1
The function svm_refresh_apicv_exec_ctrl() always returning prematurely
as kvm_vcpu_apicv_active() always return false when calling from
the function arch/x86/kvm/x86.c:kvm_vcpu_deactivate_apicv().
This is because the apicv_active is set to false just before calling
refresh_apicv_exec_ctrl().
Also, we need to mark VMCB_AVIC bit as dirty instead of VMCB_INTR.
So, fix svm_refresh_apicv_exec_ctrl() to properly deactivate AVIC.
Fixes: 67034bb9dd5e ('KVM: SVM: Add irqchip_split() checks before enabling AVIC')
Cc: Radim Krčmář <[email protected]>
Suggested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d35c9002f282..75508f665aef 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5114,11 +5114,11 @@ static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
struct vmcb *vmcb = svm->vmcb;
- if (!kvm_vcpu_apicv_active(&svm->vcpu))
- return;
-
- vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
- mark_dirty(vmcb, VMCB_INTR);
+ if (kvm_vcpu_apicv_active(vcpu))
+ vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ else
+ vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+ mark_dirty(vmcb, VMCB_AVIC);
}
static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
--
2.17.1
Add hooks for handling the case when guest VM update APIC ID, DFR and LDR.
This is needed during when AMD AVIC is temporary deactivated.
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 +++
arch/x86/kvm/lapic.c | 11 +++++++++--
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 05b5778c769e..cad47b4955e2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1083,6 +1083,9 @@ struct kvm_x86_ops {
void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
+ void (*hwapic_apic_id_update)(struct kvm_vcpu *vcpu);
+ void (*hwapic_dfr_update)(struct kvm_vcpu *vcpu);
+ void (*hwapic_ldr_update)(struct kvm_vcpu *vcpu);
bool (*guest_apic_has_interrupt)(struct kvm_vcpu *vcpu);
void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
void (*set_virtual_apic_mode)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9f089e2e09d0..a067a7292a5f 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -267,6 +267,8 @@ static inline void kvm_apic_set_xapic_id(struct kvm_lapic *apic, u8 id)
static inline void kvm_apic_set_ldr(struct kvm_lapic *apic, u32 id)
{
kvm_lapic_set_reg(apic, APIC_LDR, id);
+ if (kvm_x86_ops->hwapic_ldr_update)
+ kvm_x86_ops->hwapic_ldr_update(apic->vcpu);
recalculate_apic_map(apic->vcpu->kvm);
}
@@ -1809,10 +1811,13 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
switch (reg) {
case APIC_ID: /* Local APIC ID */
- if (!apic_x2apic_mode(apic))
+ if (!apic_x2apic_mode(apic)) {
kvm_apic_set_xapic_id(apic, val >> 24);
- else
+ if (kvm_x86_ops->hwapic_apic_id_update)
+ kvm_x86_ops->hwapic_apic_id_update(apic->vcpu);
+ } else {
ret = 1;
+ }
break;
case APIC_TASKPRI:
@@ -1834,6 +1839,8 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
case APIC_DFR:
if (!apic_x2apic_mode(apic)) {
kvm_lapic_set_reg(apic, APIC_DFR, val | 0x0FFFFFFF);
+ if (kvm_x86_ops->hwapic_dfr_update)
+ kvm_x86_ops->hwapic_dfr_update(apic->vcpu);
recalculate_apic_map(apic->vcpu->kvm);
} else
ret = 1;
--
2.17.1
During AVIC temporary deactivation, guest could update APIC ID,
DFR and LDR registers, which would not be trapped by
avic_unaccelerated_ccess_interception(). In this case, we need
to update the AVIC logical APIC ID table accordingly.
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 75508f665aef..c33f02b4a19e 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4649,6 +4649,24 @@ static void avic_handle_dfr_update(struct kvm_vcpu *vcpu)
svm->dfr_reg = dfr;
}
+static void svm_hwapic_ldr_update(struct kvm_vcpu *vcpu)
+{
+ if (svm_get_enable_apicv(vcpu) && !kvm_vcpu_apicv_active(vcpu))
+ avic_handle_ldr_update(vcpu);
+}
+
+static void svm_hwapic_apic_id_update(struct kvm_vcpu *vcpu)
+{
+ if (svm_get_enable_apicv(vcpu) && !kvm_vcpu_apicv_active(vcpu))
+ avic_handle_apic_id_update(vcpu);
+}
+
+static void svm_hwapic_dfr_update(struct kvm_vcpu *vcpu)
+{
+ if (svm_get_enable_apicv(vcpu) && !kvm_vcpu_apicv_active(vcpu))
+ avic_handle_dfr_update(vcpu);
+}
+
static int avic_unaccel_trap_write(struct vcpu_svm *svm)
{
struct kvm_lapic *apic = svm->vcpu.arch.apic;
@@ -7166,6 +7184,9 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.load_eoi_exitmap = svm_load_eoi_exitmap,
.hwapic_irr_update = svm_hwapic_irr_update,
.hwapic_isr_update = svm_hwapic_isr_update,
+ .hwapic_apic_id_update = svm_hwapic_apic_id_update,
+ .hwapic_dfr_update = svm_hwapic_dfr_update,
+ .hwapic_ldr_update = svm_hwapic_ldr_update,
.sync_pir_to_irr = kvm_lapic_find_highest_irr,
.apicv_post_state_restore = avic_post_state_restore,
--
2.17.1
Activate/deactivate AVIC requires setting/unsetting the memory region used
for APIC_ACCESS_PAGE_PRIVATE_MEMSLOT. So, re-factor avic_init_access_page()
to avic_setup_access_page() and add srcu_read_lock/unlock, which are needed
to allow this function to be called during run-time.
Also, introduce avic_destroy_access_page() to unset the page when
deactivate AVIC.
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c33f02b4a19e..a23b1e77acb6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1658,7 +1658,7 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
* field of the VMCB. Therefore, we set up the
* APIC_ACCESS_PAGE_PRIVATE_MEMSLOT (4KB) here.
*/
-static int avic_init_access_page(struct kvm_vcpu *vcpu)
+static int avic_setup_access_page(struct kvm_vcpu *vcpu, bool init)
{
struct kvm *kvm = vcpu->kvm;
int ret = 0;
@@ -1667,10 +1667,14 @@ static int avic_init_access_page(struct kvm_vcpu *vcpu)
if (kvm->arch.apic_access_page_done)
goto out;
+ if (!init)
+ srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
ret = __x86_set_memory_region(kvm,
APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
APIC_DEFAULT_PHYS_BASE,
PAGE_SIZE);
+ if (!init)
+ vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
if (ret)
goto out;
@@ -1680,6 +1684,26 @@ static int avic_init_access_page(struct kvm_vcpu *vcpu)
return ret;
}
+static void avic_destroy_access_page(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+
+ mutex_lock(&kvm->slots_lock);
+
+ if (!kvm->arch.apic_access_page_done)
+ goto out;
+
+ srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
+ __x86_set_memory_region(kvm,
+ APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
+ APIC_DEFAULT_PHYS_BASE,
+ 0);
+ vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
+ kvm->arch.apic_access_page_done = false;
+out:
+ mutex_unlock(&kvm->slots_lock);
+}
+
static int avic_init_backing_page(struct kvm_vcpu *vcpu)
{
int ret;
@@ -1687,7 +1711,7 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
int id = vcpu->vcpu_id;
struct vcpu_svm *svm = to_svm(vcpu);
- ret = avic_init_access_page(vcpu);
+ ret = avic_setup_access_page(vcpu, true);
if (ret)
return ret;
--
2.17.1
AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
deactivated and fall back to using legacy interrupt injection via
vINTR and interrupt window.
Introduce svm_request_activate/deactivate_avic() helper functions,
which handle steps required to activate/deactivate AVIC.
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm.c | 57 +++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 54 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a23b1e77acb6..580ab40ba207 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -383,6 +383,8 @@ static u8 rsm_ins_bytes[] = "\x0f\xaa";
static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
static void svm_complete_interrupts(struct vcpu_svm *svm);
+static void svm_request_activate_avic(struct kvm_vcpu *vcpu);
+static bool svm_get_enable_apicv(struct kvm_vcpu *vcpu);
static int nested_svm_exit_handled(struct vcpu_svm *svm);
static int nested_svm_intercept(struct vcpu_svm *svm);
@@ -2085,6 +2087,9 @@ static void avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ if (!kvm_vcpu_apicv_active(vcpu))
+ return;
+
svm->avic_is_running = is_run;
if (is_run)
avic_vcpu_load(vcpu, vcpu->cpu);
@@ -2319,6 +2324,10 @@ static void svm_vcpu_blocking(struct kvm_vcpu *vcpu)
static void svm_vcpu_unblocking(struct kvm_vcpu *vcpu)
{
+ if (kvm_check_request(KVM_REQ_APICV_ACTIVATE, vcpu))
+ kvm_vcpu_activate_apicv(vcpu);
+ if (kvm_check_request(KVM_REQ_APICV_DEACTIVATE, vcpu))
+ kvm_vcpu_deactivate_apicv(vcpu);
avic_set_running(vcpu, true);
}
@@ -4460,6 +4469,15 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
{
kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
svm_clear_vintr(svm);
+
+ /*
+ * For AVIC, the only reason to end up here is ExtINTs.
+ * In this case AVIC was temporarily disabled for
+ * requesting the IRQ window and we have to re-enable it.
+ */
+ if (svm_get_enable_apicv(&svm->vcpu))
+ svm_request_activate_avic(&svm->vcpu);
+
svm->vmcb->control.int_ctl &= ~V_IRQ_MASK;
mark_dirty(svm->vmcb, VMCB_INTR);
++svm->vcpu.stat.irq_window_exits;
@@ -5150,6 +5168,34 @@ static void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
{
}
+static bool is_avic_active(struct vcpu_svm *svm)
+{
+ return (svm_get_enable_apicv(&svm->vcpu) &&
+ svm->vmcb->control.int_ctl & AVIC_ENABLE_MASK);
+}
+
+static void svm_request_activate_avic(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (!lapic_in_kernel(vcpu) || is_avic_active(svm))
+ return;
+
+ avic_setup_access_page(vcpu, false);
+ kvm_make_apicv_activate_request(vcpu->kvm);
+}
+
+static void svm_request_deactivate_avic(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (!lapic_in_kernel(vcpu) || !is_avic_active(svm))
+ return;
+
+ avic_destroy_access_page(vcpu);
+ kvm_make_apicv_deactivate_request(vcpu->kvm);
+}
+
/* Note: Currently only used by Hyper-V. */
static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
{
@@ -5437,9 +5483,6 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
- if (kvm_vcpu_apicv_active(vcpu))
- return;
-
/*
* In case GIF=0 we can't rely on the CPU to tell us when GIF becomes
* 1, because that's a separate STGI/VMRUN intercept. The next time we
@@ -5449,6 +5492,14 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
* window under the assumption that the hardware will set the GIF.
*/
if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
+ /*
+ * IRQ window is not needed when AVIC is enabled,
+ * unless we have pending ExtINT since it cannot be injected
+ * via AVIC. In such case, we need to temporarily disable AVIC,
+ * and fallback to injecting IRQ via V_IRQ.
+ */
+ if (kvm_vcpu_apicv_active(vcpu))
+ svm_request_deactivate_avic(&svm->vcpu);
svm_set_vintr(svm);
svm_inject_irq(svm, 0x0);
}
--
2.17.1
From: Julian Stecklina <[email protected]>
ACK notifiers don't work with AMD AVIC when the PIT interrupt is
delivered as edge-triggered fixed interrupt via the IOAPIC. AMD
processors cannot exit on EOI for these interrupts. The ACK notifiers do
work when the interrupt is delivered via PIC as ExtINT, because the ACK
comes as PIO write that KVM sees.
Change the PIT code to not rely on the ACK notifiers. The IRQ ACK
notifier in the PIT emulation re-schedules pit->expired to reinject any
pending PIT interrupt. This seems useless, because we can pulse the PIT
interrupt even when the interrupt is not ACKed yet. This means any timer
expiry when the interrupt was being handled by the guest, will cause an
interrupt to be injected automatically when the interrupt is ACKed.
Reviewed-by: Filippo Sironi <[email protected]>
Signed-off-by: Julian Stecklina <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/i8254.c | 22 +---------------------
arch/x86/kvm/i8254.h | 2 --
2 files changed, 1 insertion(+), 23 deletions(-)
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index af192895b1fc..a8f6eb0ac1a0 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -200,21 +200,6 @@ static inline struct kvm_pit *pit_state_to_pit(struct kvm_kpit_state *ps)
return container_of(ps, struct kvm_pit, pit_state);
}
-static void kvm_pit_ack_irq(struct kvm_irq_ack_notifier *kian)
-{
- struct kvm_kpit_state *ps = container_of(kian, struct kvm_kpit_state,
- irq_ack_notifier);
- struct kvm_pit *pit = pit_state_to_pit(ps);
-
- atomic_set(&ps->irq_ack, 1);
- /* irq_ack should be set before pending is read. Order accesses with
- * inc(pending) in pit_timer_fn and xchg(irq_ack, 0) in pit_do_work.
- */
- smp_mb();
- if (atomic_dec_if_positive(&ps->pending) > 0)
- kthread_queue_work(pit->worker, &pit->expired);
-}
-
void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu)
{
struct kvm_pit *pit = vcpu->kvm->arch.vpit;
@@ -244,7 +229,7 @@ static void pit_do_work(struct kthread_work *work)
int i;
struct kvm_kpit_state *ps = &pit->pit_state;
- if (atomic_read(&ps->reinject) && !atomic_xchg(&ps->irq_ack, 0))
+ if (atomic_read(&ps->reinject) && !atomic_xchg(&ps->pending, 0))
return;
kvm_set_irq(kvm, pit->irq_source_id, 0, 1, false);
@@ -284,7 +269,6 @@ static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
static inline void kvm_pit_reset_reinject(struct kvm_pit *pit)
{
atomic_set(&pit->pit_state.pending, 0);
- atomic_set(&pit->pit_state.irq_ack, 1);
}
void kvm_pit_set_reinject(struct kvm_pit *pit, bool reinject)
@@ -298,10 +282,8 @@ void kvm_pit_set_reinject(struct kvm_pit *pit, bool reinject)
if (reinject) {
/* The initial state is preserved while ps->reinject == 0. */
kvm_pit_reset_reinject(pit);
- kvm_register_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
kvm_register_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
} else {
- kvm_unregister_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
kvm_unregister_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
}
@@ -679,8 +661,6 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
hrtimer_init(&pit_state->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
pit_state->timer.function = pit_timer_fn;
- pit_state->irq_ack_notifier.gsi = 0;
- pit_state->irq_ack_notifier.irq_acked = kvm_pit_ack_irq;
pit->mask_notifier.func = pit_mask_notifer;
kvm_pit_reset(pit);
diff --git a/arch/x86/kvm/i8254.h b/arch/x86/kvm/i8254.h
index 394d9527da7e..4a9cfe00306f 100644
--- a/arch/x86/kvm/i8254.h
+++ b/arch/x86/kvm/i8254.h
@@ -34,8 +34,6 @@ struct kvm_kpit_state {
struct mutex lock;
atomic_t reinject;
atomic_t pending; /* accumulated triggered timers */
- atomic_t irq_ack;
- struct kvm_irq_ack_notifier irq_ack_notifier;
};
struct kvm_pit {
--
2.17.1
Once the IRQ ack notifier for in-kernel PIT is no longer required
and run-time AVIC activate/deactivate is supported, we can remove
the kernel irqchip split mode requirement for AVIC.
Hence, remove the check for irqchip split mode when enabling AVIC.
Cc: Radim Krčmář <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 580ab40ba207..24dfa6a93711 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5157,7 +5157,7 @@ static void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
static bool svm_get_enable_apicv(struct kvm_vcpu *vcpu)
{
- return avic && irqchip_split(vcpu->kvm);
+ return avic;
}
static void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
--
2.17.1
Certain types of interrupt cannot be supported by AMD AVIC hardware.
Therefore, there is a need to temporary deactivate AVIC and fallback to
legacy interrupt injection mechanism.
Since AMD AVIC requires all vcpus to be operating in the same mode.
So, introduce new interface to request all vCPUs to activate/deactivate
APICV during run-time.
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 8 ++++++++
arch/x86/kvm/x86.c | 36 +++++++++++++++++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4660ce90de7f..05b5778c769e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -78,6 +78,10 @@
#define KVM_REQ_HV_STIMER KVM_ARCH_REQ(22)
#define KVM_REQ_LOAD_EOI_EXITMAP KVM_ARCH_REQ(23)
#define KVM_REQ_GET_VMCS12_PAGES KVM_ARCH_REQ(24)
+#define KVM_REQ_APICV_ACTIVATE \
+ KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_APICV_DEACTIVATE \
+ KVM_ARCH_REQ_FLAGS(26, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
#define CR0_RESERVED_BITS \
(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -1536,6 +1540,10 @@ bool kvm_is_linear_rip(struct kvm_vcpu *vcpu, unsigned long linear_rip);
void kvm_make_mclock_inprogress_request(struct kvm *kvm);
void kvm_make_scan_ioapic_request(struct kvm *kvm);
+void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
+void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu);
+void kvm_make_apicv_activate_request(struct kvm *kvm);
+void kvm_make_apicv_deactivate_request(struct kvm *kvm);
void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
struct kvm_async_pf *work);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 02c8e095a239..e93e2ef923b4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7045,11 +7045,19 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
}
+void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu)
+{
+ vcpu->arch.apicv_active = true;
+ kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_activate_apicv);
+
void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
{
vcpu->arch.apicv_active = false;
kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
}
+EXPORT_SYMBOL_GPL(kvm_vcpu_deactivate_apicv);
int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
{
@@ -7541,6 +7549,30 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
}
+void kvm_make_apicv_activate_request(struct kvm *kvm)
+{
+ int i;
+ struct kvm_vcpu *v;
+
+ kvm_for_each_vcpu(i, v, kvm)
+ kvm_clear_request(KVM_REQ_APICV_DEACTIVATE, v);
+
+ kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_ACTIVATE);
+}
+EXPORT_SYMBOL_GPL(kvm_make_apicv_activate_request);
+
+void kvm_make_apicv_deactivate_request(struct kvm *kvm)
+{
+ int i;
+ struct kvm_vcpu *v;
+
+ kvm_for_each_vcpu(i, v, kvm)
+ kvm_clear_request(KVM_REQ_APICV_ACTIVATE, v);
+
+ kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_DEACTIVATE);
+}
+EXPORT_SYMBOL_GPL(kvm_make_apicv_deactivate_request);
+
static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
{
if (!kvm_apic_present(vcpu))
@@ -7727,6 +7759,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
*/
if (kvm_check_request(KVM_REQ_HV_STIMER, vcpu))
kvm_hv_process_stimers(vcpu);
+ if (kvm_check_request(KVM_REQ_APICV_ACTIVATE, vcpu))
+ kvm_vcpu_activate_apicv(vcpu);
+ if (kvm_check_request(KVM_REQ_APICV_DEACTIVATE, vcpu))
+ kvm_vcpu_deactivate_apicv(vcpu);
}
if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
--
2.17.1
On Mon, 4 Feb 2019 14:42:32 +0000
"Suthikulpanit, Suravee" <[email protected]> wrote:
> Once the IRQ ack notifier for in-kernel PIT is no longer required
> and run-time AVIC activate/deactivate is supported, we can remove
> the kernel irqchip split mode requirement for AVIC.
>
> Hence, remove the check for irqchip split mode when enabling AVIC.
Yay! Could we also at this point make avic enabled by default or are
there remaining incompatibilities? Thanks,
Alex
> Cc: Radim Krčmář <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 580ab40ba207..24dfa6a93711 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -5157,7 +5157,7 @@ static void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
>
> static bool svm_get_enable_apicv(struct kvm_vcpu *vcpu)
> {
> - return avic && irqchip_split(vcpu->kvm);
> + return avic;
> }
>
> static void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
On 04/02/19 15:42, Suthikulpanit, Suravee wrote:
> From: Julian Stecklina <[email protected]>
>
> ACK notifiers don't work with AMD AVIC when the PIT interrupt is
> delivered as edge-triggered fixed interrupt via the IOAPIC. AMD
> processors cannot exit on EOI for these interrupts. The ACK notifiers do
> work when the interrupt is delivered via PIC as ExtINT, because the ACK
> comes as PIO write that KVM sees.
>
> Change the PIT code to not rely on the ACK notifiers. The IRQ ACK
> notifier in the PIT emulation re-schedules pit->expired to reinject any
> pending PIT interrupt. This seems useless, because we can pulse the PIT
> interrupt even when the interrupt is not ACKed yet. This means any timer
> expiry when the interrupt was being handled by the guest, will cause an
> interrupt to be injected automatically when the interrupt is ACKed.
The difference is that with the irq ack notifier *all* the pending PIT
interrupts will be delivered. Without, only one will. If you don't
want the ack notifier, you can do that with the KVM_REINJECT_CONTROL
ioctl.
Unfortunately, reinject is enabled by default and that is part of the
userspace ABI and we cannot change it.
It's possible to enable AVIC when reinject is disabled, and we fix that
at the QEMU level by changing the default reinject behavior, basically
with this one-line change:
diff --git a/hw/i386/kvm/i8254.c b/hw/i386/kvm/i8254.c
index d4d4a85..1d82860 100644
--- a/hw/i386/kvm/i8254.c
+++ b/hw/i386/kvm/i8254.c
@@ -305,7 +305,7 @@ static void kvm_pit_realizefn(DeviceState *dev, Error **errp)
static Property kvm_pit_properties[] = {
DEFINE_PROP_UINT32("iobase", PITCommonState, iobase, -1),
DEFINE_PROP_LOSTTICKPOLICY("lost_tick_policy", KVMPITState,
- lost_tick_policy, LOST_TICK_POLICY_DELAY),
+ lost_tick_policy, LOST_TICK_POLICY_DISCARD),
DEFINE_PROP_END_OF_LIST(),
};
plus the corresponding change to compat properties in hw/i386/pc.c.
Alternatively, it is probably a good time to switch the default to split irqchip
in QEMU. Split irqchip was introduced in kernel 4.5, which was released about
three years ago.
Paolo
> Reviewed-by: Filippo Sironi <[email protected]>
> Signed-off-by: Julian Stecklina <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
Alex,
On 2/6/19 1:34 AM, Alex Williamson wrote:
> On Mon, 4 Feb 2019 14:42:32 +0000
> "Suthikulpanit, Suravee"<[email protected]> wrote:
>
>> Once the IRQ ack notifier for in-kernel PIT is no longer required
>> and run-time AVIC activate/deactivate is supported, we can remove
>> the kernel irqchip split mode requirement for AVIC.
>>
>> Hence, remove the check for irqchip split mode when enabling AVIC.
> Yay! Could we also at this point make avic enabled by default or are
> there remaining incompatibilities? Thanks,
I'm looking into that next. I would need to ensure that enabling
AVIC would not cause issues with other features.
Suravee
Paolo Bonzini <[email protected]> writes:
> Alternatively, it is probably a good time to switch the default to split irqchip
> in QEMU. Split irqchip was introduced in kernel 4.5, which was released about
> three years ago.
I totally agree. At some point, the in-kernel PIT/PIC emulation should
also be removed, but I guess it's too early to talk about this.
Julian
On Wed, 2019-02-06 at 11:20 +0000, Suthikulpanit, Suravee wrote:
> Alex,
>
> On 2/6/19 1:34 AM, Alex Williamson wrote:
> > On Mon, 4 Feb 2019 14:42:32 +0000
> > "Suthikulpanit, Suravee"<[email protected]> wrote:
> >
> > > Once the IRQ ack notifier for in-kernel PIT is no longer required
> > > and run-time AVIC activate/deactivate is supported, we can remove
> > > the kernel irqchip split mode requirement for AVIC.
> > >
> > > Hence, remove the check for irqchip split mode when enabling AVIC.
> >
> > Yay! Could we also at this point make avic enabled by default or are
> > there remaining incompatibilities? Thanks,
>
> I'm looking into that next. I would need to ensure that enabling
> AVIC would not cause issues with other features.
>
> Suravee
Hi!
Do you have any update on the state of this patch?
I kind of stumbled on it accidently, while
trying to understand why AVIC is only enabled in the split irqchip mode.
Best regards,
Maxim Levitsky
Hi,
On 6/15/19 9:28 AM, Maxim Levitsky wrote:
> On Wed, 2019-02-06 at 11:20 +0000, Suthikulpanit, Suravee wrote:
>> Alex,
>>
>> On 2/6/19 1:34 AM, Alex Williamson wrote:
>>> On Mon, 4 Feb 2019 14:42:32 +0000
>>> "Suthikulpanit, Suravee"<[email protected]> wrote:
>>>
>>>> Once the IRQ ack notifier for in-kernel PIT is no longer required
>>>> and run-time AVIC activate/deactivate is supported, we can remove
>>>> the kernel irqchip split mode requirement for AVIC.
>>>>
>>>> Hence, remove the check for irqchip split mode when enabling AVIC.
>>>
>>> Yay! Could we also at this point make avic enabled by default or are
>>> there remaining incompatibilities? Thanks,
>>
>> I'm looking into that next. I would need to ensure that enabling
>> AVIC would not cause issues with other features.
>>
>> Suravee
>
> Hi!
>
> Do you have any update on the state of this patch?
> I kind of stumbled on it accidently, while
> trying to understand why AVIC is only enabled in the split irqchip mode.
I'm still working on this and testing the series.
I'll post this soon.
Thanks,
Suravee
> Best regards,
> Maxim Levitsky
>