Drop the WARN in KVM_RUN that asserts that KVM isn't using the hypervisor
timer, a.k.a. the VMX preemption timer, for a vCPU that is in the
UNINITIALIZED activity state. The intent of the WARN is to sanity check
that KVM won't drop a timer interrupt due to an unexpected transition to
UNINITIALIZED, but unfortunately userspace can use various ioctl()s to
force the unexpected state.

Drop the sanity check instead of switching from the hypervisor timer to a
software-based timer, as the only reason to switch to a software timer
when a vCPU is blocking is to ensure the timer interrupt wakes the vCPU,
but said interrupt isn't a valid wake event for vCPUs in UNINITIALIZED
state *and* the interrupt will be dropped in the end.

Reported-by: Yikebaer Aizezi <[email protected]>
Closes: https://lore.kernel.org/all/CALcu4rbFrU4go8sBHk3FreP+qjgtZCGcYNpSiEXOLm==qFv7iQ@mail.gmail.com
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/x86.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a6b9bea62fb8..fa7eeb45b8e3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11091,12 +11091,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
r = -EINTR;
goto out;
}
+
/*
- * It should be impossible for the hypervisor timer to be in
- * use before KVM has ever run the vCPU.
+ * Don't bother switching APIC timer emulation from the
+ * hypervisor timer to the software timer, the only way for the
+ * APIC timer to be active is if userspace stuffed vCPU state,
+ * i.e. put the vCPU into a nonsensical state. Only an INIT
+ * will transition the vCPU out of UNINITIALIZED (without more
+ * state stuffing from userspace), which will reset the local
+ * APIC and thus smother the timer anyways, i.e. the APIC timer
+ * IRQ(s) will be dropped no matter what.
*/
- WARN_ON_ONCE(kvm_lapic_hv_timer_in_use(vcpu));
-
kvm_vcpu_srcu_read_unlock(vcpu);
kvm_vcpu_block(vcpu);
kvm_vcpu_srcu_read_lock(vcpu);
base-commit: fdf0eaf11452d72945af31804e2a1048ee1b574c
--
2.41.0.640.ga95def55d0-goog
On 8/9/23 01:20, Sean Christopherson wrote:
> /*
> - * It should be impossible for the hypervisor timer to be in
> - * use before KVM has ever run the vCPU.
> + * Don't bother switching APIC timer emulation from the
> + * hypervisor timer to the software timer, the only way for the
> + * APIC timer to be active is if userspace stuffed vCPU state,
> + * i.e. put the vCPU into a nonsensical state. Only an INIT
> + * will transition the vCPU out of UNINITIALIZED (without more
> + * state stuffing from userspace), which will reset the local
> + * APIC and thus smother the timer anyways, i.e. the APIC timer
"Cancel" is probably more understandable to non-native speakers, though
undoubtedly less poetic.
Reviewed-by: Paolo Bonzini <[email protected]>
Paolo
> + * IRQ(s) will be dropped no matter what.
> */
> - WARN_ON_ONCE(kvm_lapic_hv_timer_in_use(vcpu));
> -
On Thu, Aug 10, 2023, Paolo Bonzini wrote:
> On 8/9/23 01:20, Sean Christopherson wrote:
> > /*
> > - * It should be impossible for the hypervisor timer to be in
> > - * use before KVM has ever run the vCPU.
> > + * Don't bother switching APIC timer emulation from the
> > + * hypervisor timer to the software timer, the only way for the
> > + * APIC timer to be active is if userspace stuffed vCPU state,
> > + * i.e. put the vCPU into a nonsensical state. Only an INIT
> > + * will transition the vCPU out of UNINITIALIZED (without more
> > + * state stuffing from userspace), which will reset the local
> > + * APIC and thus smother the timer anyways, i.e. the APIC timer
>
> "Cancel" is probably more understandable to non-native speakers, though
> undoubtedly less poetic.
I intentionally avoided "cancel" because there is no guarantee the timer would
actually be canceled (if KVM switched to the software timer). I am/was trying to
call out that even if the timer expires and pends an interrupt, the interrupt
will ultimately be dropped.
Maybe "squashed"?
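
For readers following the "cancel" vs. "smother" distinction, here is a minimal
user-space sketch of the argument. It is a toy model, not KVM code: every name
(toy_vcpu, toy_timer_expire(), toy_deliver_init(), and so on) is hypothetical,
and the state machine is reduced to just what the comment in the patch
describes. It assumes, per the patch, that a timer interrupt is not a wake
event in UNINITIALIZED and that INIT resets the local APIC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not KVM's types. */
enum toy_mp_state { TOY_UNINITIALIZED, TOY_INIT_RECEIVED };

struct toy_vcpu {
	enum toy_mp_state mp_state;
	bool timer_armed;       /* APIC timer programmed (hv or software backed) */
	bool timer_irq_pending; /* timer expired and pended an interrupt */
};

/* Timer expiry is not a cancellation: it pends an interrupt. */
static void toy_timer_expire(struct toy_vcpu *v)
{
	if (!v->timer_armed)
		return;
	v->timer_armed = false;
	v->timer_irq_pending = true;
}

/* A pending timer IRQ does not wake a vCPU that is still UNINITIALIZED. */
static bool toy_vcpu_should_wake(const struct toy_vcpu *v)
{
	return v->mp_state != TOY_UNINITIALIZED && v->timer_irq_pending;
}

/* INIT resets the toy local APIC: armed timer and pending IRQ both go away. */
static void toy_deliver_init(struct toy_vcpu *v)
{
	v->timer_armed = false;
	v->timer_irq_pending = false;
	v->mp_state = TOY_INIT_RECEIVED;
}

/*
 * Start from the "stuffed" state (UNINITIALIZED with an armed timer) and
 * report whether a timer IRQ is still pending once INIT has been delivered.
 */
static bool toy_timer_irq_survives(bool expire_before_init)
{
	struct toy_vcpu v = {
		.mp_state = TOY_UNINITIALIZED,
		.timer_armed = true,
	};

	if (expire_before_init) {
		toy_timer_expire(&v);
		/* The IRQ is pending, yet it is not a valid wake event. */
		assert(v.timer_irq_pending && !toy_vcpu_should_wake(&v));
	}

	toy_deliver_init(&v);
	toy_timer_expire(&v); /* no-op: the reset disarmed the timer */

	return v.timer_irq_pending;
}
```

In this model toy_timer_irq_survives() returns false whether or not the timer
fires before INIT, which is the sense in which the interrupt is dropped "no
matter what" even though the timer itself is never explicitly canceled.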