When processing KVM_REQ_EVENT, apic_update_ppr is called, which may set
KVM_REQ_EVENT again if the recalculated value of PPR becomes smaller
than the previous one. This results in cancelling the guest entry and
reiterating in vcpu_enter_guest.

However, this is unnecessary because at this point KVM_REQ_EVENT is
already being processed and there are no other changes in the lapic
that may require full-fledged state recalculation.

This situation is often hit on systems with TPR shadow, where the
TPR can be updated by the guest without a vmexit, so that the first
apic_update_ppr to notice it is exactly the one called while
processing KVM_REQ_EVENT.

To avoid it, introduce a parameter in apic_update_ppr that suppresses
setting of KVM_REQ_EVENT, and use it on the paths called from
KVM_REQ_EVENT processing.

This microoptimization gives a 10% performance increase on a synthetic
test doing a lot of IPC in Windows using window messages.

Reviewed-by: Roman Kagan <[email protected]>
Signed-off-by: Denis Plotnikov <[email protected]>
---
arch/x86/kvm/lapic.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 6f69340..b3025d8 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -544,7 +544,7 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
__clear_bit(KVM_APIC_PV_EOI_PENDING, &vcpu->arch.apic_attention);
}
-static void apic_update_ppr(struct kvm_lapic *apic)
+static void apic_update_ppr(struct kvm_lapic *apic, bool make_req)
{
u32 tpr, isrv, ppr, old_ppr;
int isr;
@@ -564,7 +564,7 @@ static void apic_update_ppr(struct kvm_lapic *apic)
if (old_ppr != ppr) {
kvm_lapic_set_reg(apic, APIC_PROCPRI, ppr);
- if (ppr < old_ppr)
+ if (make_req && ppr < old_ppr)
kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
}
}
@@ -572,7 +572,7 @@ static void apic_update_ppr(struct kvm_lapic *apic)
static void apic_set_tpr(struct kvm_lapic *apic, u32 tpr)
{
kvm_lapic_set_reg(apic, APIC_TASKPRI, tpr);
- apic_update_ppr(apic);
+ apic_update_ppr(apic, true);
}
static bool kvm_apic_broadcast(struct kvm_lapic *apic, u32 mda)
@@ -1032,7 +1032,7 @@ static int apic_set_eoi(struct kvm_lapic *apic)
return vector;
apic_clear_isr(vector, apic);
- apic_update_ppr(apic);
+ apic_update_ppr(apic, true);
if (test_bit(vector, vcpu_to_synic(apic->vcpu)->vec_bitmap))
kvm_hv_synic_send_eoi(apic->vcpu, vector);
@@ -1147,7 +1147,7 @@ static u32 __apic_read(struct kvm_lapic *apic, unsigned int offset)
val = apic_get_tmcct(apic);
break;
case APIC_PROCPRI:
- apic_update_ppr(apic);
+ apic_update_ppr(apic, true);
val = kvm_lapic_get_reg(apic, offset);
break;
case APIC_TASKPRI:
@@ -1841,7 +1841,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
kvm_lapic_set_base(vcpu,
vcpu->arch.apic_base | MSR_IA32_APICBASE_BSP);
vcpu->arch.pv_eoi.msr_val = 0;
- apic_update_ppr(apic);
+ apic_update_ppr(apic, true);
vcpu->arch.apic_arb_prio = 0;
vcpu->arch.apic_attention = 0;
@@ -1964,7 +1964,7 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
if (!apic_enabled(apic))
return -1;
- apic_update_ppr(apic);
+ apic_update_ppr(apic, false);
highest_irr = apic_find_highest_irr(apic);
if ((highest_irr == -1) ||
((highest_irr & 0xF0) <= kvm_lapic_get_reg(apic, APIC_PROCPRI)))
@@ -2013,12 +2013,12 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
*/
apic_set_isr(vector, apic);
- apic_update_ppr(apic);
+ apic_update_ppr(apic, true);
apic_clear_irr(vector, apic);
if (test_bit(vector, vcpu_to_synic(vcpu)->auto_eoi_bitmap)) {
apic_clear_isr(vector, apic);
- apic_update_ppr(apic);
+ apic_update_ppr(apic, true);
}
return vector;
@@ -2068,7 +2068,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
recalculate_apic_map(vcpu->kvm);
kvm_apic_set_version(vcpu);
- apic_update_ppr(apic);
+ apic_update_ppr(apic, true);
hrtimer_cancel(&apic->lapic_timer.timer);
apic_update_lvtt(apic);
apic_manage_nmi_watchdog(apic, kvm_lapic_get_reg(apic, APIC_LVT0));
--
2.10.1.352.g0cf3611
2016-12-12 17:02+0300, Denis Plotnikov:
> When processing KVM_REQ_EVENT, apic_update_ppr is called which may set
> KVM_REQ_EVENT again if the recalculated value of PPR becomes smaller
> than the previous one. This results in cancelling the guest entry and
> reiterating in vcpu_enter_guest.
>
> However this is unnecessary because at this point KVM_REQ_EVENT is
> already being processed and there are no other changes in the lapic
> that may require full-fledged state recalculation.
>
> This situation is often hit on systems with TPR shadow, where the
> TPR can be updated by the guest without a vmexit, so that the first
> apic_update_ppr to notice it is exactly the one called while
> processing KVM_REQ_EVENT.
>
> To avoid it, introduce a parameter in apic_update_ppr allowing to
> suppress setting of KVM_REQ_EVENT, and use it on the paths called from
> KVM_REQ_EVENT processing.

We also call:

  kvm_cpu_get_interrupt() in nested_vmx_vmexit()
    - that path is intended without KVM_REQ_EVENT
  kvm_cpu_has_interrupt() in vmx_check_nested_events(),
    - I think it does no harm
  kvm_cpu_has_interrupt() in kvm_vcpu_has_events()
  kvm_cpu_has_interrupt() in kvm_vcpu_ready_for_interrupt_injection()
    - both seem safe as we should not have an interrupt between TPR
      threshold and the new PPR value, so the KVM_REQ_EVENT was useless.

I would prefer we made sure that only callers from KVM_REQ_EVENT used
the function we are changing -- it is really easy to make a hard-to-find
mistake in interrupt delivery.

> This microoptimization gives 10% performance increase on a synthetic
> test doing a lot of IPC in Windows using window messages.
>
> Reviewed-by: Roman Kagan <[email protected]>
> Signed-off-by: Denis Plotnikov <[email protected]>
> ---
Still, there is a high possibility that this is going to work,
Reviewed-by: Radim Krčmář <[email protected]>
On Mon, Dec 12, 2016 at 05:29:43PM +0100, Radim Krčmář wrote:
> 2016-12-12 17:02+0300, Denis Plotnikov:
> > When processing KVM_REQ_EVENT, apic_update_ppr is called which may set
> > KVM_REQ_EVENT again if the recalculated value of PPR becomes smaller
> > than the previous one. This results in cancelling the guest entry and
> > reiterating in vcpu_enter_guest.
> >
> > However this is unnecessary because at this point KVM_REQ_EVENT is
> > already being processed and there are no other changes in the lapic
> > that may require full-fledged state recalculation.
> >
> > This situation is often hit on systems with TPR shadow, where the
> > TPR can be updated by the guest without a vmexit, so that the first
> > apic_update_ppr to notice it is exactly the one called while
> > processing KVM_REQ_EVENT.
> >
> > To avoid it, introduce a parameter in apic_update_ppr allowing to
> > suppress setting of KVM_REQ_EVENT, and use it on the paths called from
> > KVM_REQ_EVENT processing.
>
> We also call:
>
> kvm_cpu_get_interrupt() in nested_vmx_vmexit()
> - that path is intended without KVM_REQ_EVENT
> kvm_cpu_has_interrupt() in vmx_check_nested_events(),
> - I think it does no harm
> kvm_cpu_has_interrupt() in kvm_vcpu_has_events()
> kvm_cpu_has_interrupt() in kvm_vcpu_ready_for_interrupt_injection()
> - both seem safe as we should not have an interrupt between TPR
> threshold and the new PPR value, so the KVM_REQ_EVENT was useless.
>
> I would prefer we made sure that only callers from KVM_REQ_EVENT used
> the function we are changing -- it is really easy to make a hard-to-find
> mistake in interrupt delivery.
Indeed, that was my concern as well. How about introducing a parameter
to kvm_cpu_{has,get}_interrupt() with the same meaning, and pass it down
to apic_update_ppr()? Then only the call sites under KVM_REQ_EVENT
processing would pass "false" there, and the rest would remain with
"true"?
Roman.
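
For illustration, the proposal above could look roughly like the following. This is a minimal user-space sketch, not kernel code: struct toy_lapic, toy_update_ppr(), toy_has_interrupt() and the req_event flag are hypothetical stand-ins for struct kvm_lapic, apic_update_ppr(), kvm_cpu_has_interrupt() and KVM_REQ_EVENT. It assumes the usual PPR rule (the larger priority class of TPR and the highest in-service vector).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the local APIC state; NOT the kernel's struct kvm_lapic. */
struct toy_lapic {
	uint32_t tpr;      /* task priority register */
	uint32_t ppr;      /* cached processor priority register */
	int highest_isr;   /* highest in-service vector, -1 if none */
	int highest_irr;   /* highest pending vector, -1 if none */
	bool req_event;    /* models a pending KVM_REQ_EVENT */
};

/* Recompute PPR; raise the request only if the caller allows it. */
static void toy_update_ppr(struct toy_lapic *apic, bool make_req)
{
	uint32_t isrv, ppr, old_ppr;

	assert(apic);
	old_ppr = apic->ppr;
	isrv = apic->highest_isr != -1 ? (uint32_t)apic->highest_isr : 0;

	if ((apic->tpr & 0xf0) >= (isrv & 0xf0))
		ppr = apic->tpr & 0xff;
	else
		ppr = isrv & 0xf0;

	if (old_ppr != ppr) {
		apic->ppr = ppr;
		if (make_req && ppr < old_ppr)
			apic->req_event = true;
	}
}

/* Hypothetical signature: the flag is threaded through from the caller. */
static int toy_has_interrupt(struct toy_lapic *apic, bool make_req)
{
	toy_update_ppr(apic, make_req);
	if (apic->highest_irr == -1 ||
	    (uint32_t)(apic->highest_irr & 0xf0) <= apic->ppr)
		return -1;
	return apic->highest_irr;
}
```

In this model, only a caller that is already handling the event would pass false; every other caller keeps passing true, so its behaviour is unchanged.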
2016-12-12 23:20+0300, Roman Kagan:
> On Mon, Dec 12, 2016 at 05:29:43PM +0100, Radim Krčmář wrote:
>> 2016-12-12 17:02+0300, Denis Plotnikov:
>> > When processing KVM_REQ_EVENT, apic_update_ppr is called which may set
>> > KVM_REQ_EVENT again if the recalculated value of PPR becomes smaller
>> > than the previous one. This results in cancelling the guest entry and
>> > reiterating in vcpu_enter_guest.
>> >
>> > However this is unnecessary because at this point KVM_REQ_EVENT is
>> > already being processed and there are no other changes in the lapic
>> > that may require full-fledged state recalculation.
>> >
>> > This situation is often hit on systems with TPR shadow, where the
>> > TPR can be updated by the guest without a vmexit, so that the first
>> > apic_update_ppr to notice it is exactly the one called while
>> > processing KVM_REQ_EVENT.
>> >
>> > To avoid it, introduce a parameter in apic_update_ppr allowing to
>> > suppress setting of KVM_REQ_EVENT, and use it on the paths called from
>> > KVM_REQ_EVENT processing.
>>
>> We also call:
>>
>> kvm_cpu_get_interrupt() in nested_vmx_vmexit()
>> - that path is intended without KVM_REQ_EVENT
>> kvm_cpu_has_interrupt() in vmx_check_nested_events(),
>> - I think it does no harm
>> kvm_cpu_has_interrupt() in kvm_vcpu_has_events()
>> kvm_cpu_has_interrupt() in kvm_vcpu_ready_for_interrupt_injection()
>> - both seem safe as we should not have an interrupt between TPR
>> threshold and the new PPR value, so the KVM_REQ_EVENT was useless.
>>
>> I would prefer we made sure that only callers from KVM_REQ_EVENT used
>> the function we are changing -- it is really easy to make a hard-to-find
>> mistake in interrupt delivery.
>
> Indeed, that was my concern as well. How about introducing a parameter
> to kvm_cpu_{has,get}_interrupt() with the same meaning, and pass it down
> to apic_update_ppr()? Then only the call sites under KVM_REQ_EVENT
> processing would pass "false" there, and the rest would remain with
> "true"?

Sounds good.

I thought about some other solutions and it looks like we actually don't
need KVM_REQ_EVENT almost anywhere when using TPR shadow:

If we didn't get the TPR VM exit, then we know that there is no
interrupt that can be delivered after applying the change from TPR.
(In other words, if we had a queued interrupt that got unmasked by the
change, then it should have triggered the TPR threshold VM exit.)
Without TPR shadow, KVM itself must change TPR, so we would learn about
the change earlier in that case.

I think we could trigger KVM_REQ_EVENT only when lowering TPR without
TPR shadow. Your patch is definitely safer. :)
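
The reasoning above can be sketched as two small predicates. This is only a toy model under stated assumptions: both function names are hypothetical, and the threshold check assumes the hardware exits when the priority class of the virtual TPR (bits 7:4) drops below the TPR threshold (bits 3:0), with the threshold set to the class of the highest pending vector and nothing in service.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed model of the TPR-threshold check: exit when the priority class of
 * the virtual TPR (bits 7:4) falls below the threshold (bits 3:0).
 */
static bool toy_tpr_threshold_exit(uint8_t vtpr, uint8_t threshold)
{
	assert((threshold & 0xf0) == 0); /* only bits 3:0 are meaningful here */
	return threshold > (vtpr >> 4);
}

/* Hypothetical policy: only a lowered TPR without TPR shadow needs the request. */
static bool toy_tpr_write_needs_event(bool tpr_shadow, uint8_t old_tpr,
				      uint8_t new_tpr)
{
	return !tpr_shadow && new_tpr < old_tpr;
}
```

With a pending vector 0x51 (class 5) the threshold would be 5 in this model. Lowering TPR to 0x60 leaves the vector masked and causes no exit, while lowering it to 0x40 unmasks the vector and does cause the exit, so in neither case would the shadowed write need its own KVM_REQ_EVENT.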