2020-08-28 01:36:29

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH] KVM: LAPIC: Reset timer_advance_ns if timer mode switch

From: Wanpeng Li <[email protected]>

per-vCPU timer_advance_ns should be set to 0 if timer mode is not tscdeadline
otherwise we waste cpu cycles in the function lapic_timer_int_injected(),
especially on AMD platform which doesn't support tscdeadline mode. We can
reset timer_advance_ns to the initial value if switch back to tscdealine
timer mode.

Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/lapic.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 654649b..abc296d 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1499,10 +1499,16 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
kvm_lapic_set_reg(apic, APIC_TMICT, 0);
apic->lapic_timer.period = 0;
apic->lapic_timer.tscdeadline = 0;
+ if (timer_mode == APIC_LVT_TIMER_TSCDEADLINE &&
+ lapic_timer_advance_dynamic)
+ apic->lapic_timer.timer_advance_ns = LAPIC_TIMER_ADVANCE_NS_INIT;
}
apic->lapic_timer.timer_mode = timer_mode;
limit_periodic_timer_frequency(apic);
}
+ if (timer_mode != APIC_LVT_TIMER_TSCDEADLINE &&
+ lapic_timer_advance_dynamic)
+ apic->lapic_timer.timer_advance_ns = 0;
}

/*
--
2.7.4


2020-08-31 12:51:07

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH] KVM: LAPIC: Reset timer_advance_ns if timer mode switch

Wanpeng Li <[email protected]> writes:

> From: Wanpeng Li <[email protected]>
>
> per-vCPU timer_advance_ns should be set to 0 if timer mode is not tscdeadline
> otherwise we waste cpu cycles in the function lapic_timer_int_injected(),

lapic_timer_int_injected is just a test, kvm_wait_lapic_expire()
(__kvm_wait_lapic_expire()) maybe?

> especially on AMD platform which doesn't support tscdeadline mode. We can
> reset timer_advance_ns to the initial value if switch back to
> tscdealine

'tscdeadline'

> timer mode.
>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> arch/x86/kvm/lapic.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 654649b..abc296d 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1499,10 +1499,16 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
> kvm_lapic_set_reg(apic, APIC_TMICT, 0);
> apic->lapic_timer.period = 0;
> apic->lapic_timer.tscdeadline = 0;
> + if (timer_mode == APIC_LVT_TIMER_TSCDEADLINE &&
> + lapic_timer_advance_dynamic)
> + apic->lapic_timer.timer_advance_ns = LAPIC_TIMER_ADVANCE_NS_INIT;
> }
> apic->lapic_timer.timer_mode = timer_mode;
> limit_periodic_timer_frequency(apic);
> }
> + if (timer_mode != APIC_LVT_TIMER_TSCDEADLINE &&
> + lapic_timer_advance_dynamic)
> + apic->lapic_timer.timer_advance_ns = 0;
> }
>
> /*

--
Vitaly

2020-09-01 03:26:43

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: LAPIC: Reset timer_advance_ns if timer mode switch

On Mon, 31 Aug 2020 at 20:48, Vitaly Kuznetsov <[email protected]> wrote:
>
> Wanpeng Li <[email protected]> writes:
>
> > From: Wanpeng Li <[email protected]>
> >
> > per-vCPU timer_advance_ns should be set to 0 if timer mode is not tscdeadline
> > otherwise we waste cpu cycles in the function lapic_timer_int_injected(),
>
> lapic_timer_int_injected is just a test, kvm_wait_lapic_expire()
> (__kvm_wait_lapic_expire()) maybe?

Both the check in lapic_timer_int_injected(), the check in
__kvm_wait_lapic_expire(), and these function calls, we can observe
~1.3% world switch time reduce w/ this patch by
kvm-unit-tests/vmexit.flat vmcall testing on AMD server. In addition,
I think we should set apic->lapic_timer.expired_tscdeadline to 0 when
switching between tscdeadline mode and other modes on Intel in order
that we will not waste cpu cycles to tune advance value in
adjust_lapic_timer_advance() for one time.

Wanpeng

2020-09-02 21:24:23

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: LAPIC: Reset timer_advance_ns if timer mode switch

On Fri, Aug 28, 2020 at 09:35:08AM +0800, Wanpeng Li wrote:
> From: Wanpeng Li <[email protected]>
>
> per-vCPU timer_advance_ns should be set to 0 if timer mode is not tscdeadline
> otherwise we waste cpu cycles in the function lapic_timer_int_injected(),
> especially on AMD platform which doesn't support tscdeadline mode. We can
> reset timer_advance_ns to the initial value if switch back to tscdealine
> timer mode.
>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> arch/x86/kvm/lapic.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 654649b..abc296d 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1499,10 +1499,16 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
> kvm_lapic_set_reg(apic, APIC_TMICT, 0);
> apic->lapic_timer.period = 0;
> apic->lapic_timer.tscdeadline = 0;
> + if (timer_mode == APIC_LVT_TIMER_TSCDEADLINE &&
> + lapic_timer_advance_dynamic)

Bad indentation.

> + apic->lapic_timer.timer_advance_ns = LAPIC_TIMER_ADVANCE_NS_INIT;

Redoing the tuning seems odd. Doubt it will matter, but it feels weird to
have to retune the advancement just because the guest toggled between modes.

Rather than clear timer_advance_ns, can we simply move the check against
apic->lapic_timer.expired_tscdeadline much earlier? I think that would
solve this performance hiccup, and IMO would be a logical change in any
case. E.g. with some refactoring to avoid more duplication between VMX and
SVM:

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 35cca2e0c8026..54222f0071547 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1571,12 +1571,12 @@ static inline void adjust_lapic_timer_advance(struct kvm_vcpu *vcpu,
apic->lapic_timer.timer_advance_ns = timer_advance_ns;
}

-static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
+void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
{
struct kvm_lapic *apic = vcpu->arch.apic;
u64 guest_tsc, tsc_deadline;

- if (apic->lapic_timer.expired_tscdeadline == 0)
+ if (!lapic_timer_int_injected(vcpu))
return;

tsc_deadline = apic->lapic_timer.expired_tscdeadline;
@@ -1590,13 +1590,7 @@ static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
if (lapic_timer_advance_dynamic)
adjust_lapic_timer_advance(vcpu, apic->lapic_timer.advance_expire_delta);
}
-
-void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
-{
- if (lapic_timer_int_injected(vcpu))
- __kvm_wait_lapic_expire(vcpu);
-}
-EXPORT_SYMBOL_GPL(kvm_wait_lapic_expire);
+EXPORT_SYMBOL_GPL(__kvm_wait_lapic_expire);

static void kvm_apic_inject_pending_timer_irqs(struct kvm_lapic *apic)
{
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 754f29beb83e3..64be9d751196a 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -236,7 +236,14 @@ static inline int kvm_lapic_latched_init(struct kvm_vcpu *vcpu)

bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);

-void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu);
+void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu);
+static inline void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
+{
+ if (lapic_in_kernel(vcpu) &&
+ vcpu->arch.apic->lapic_timer.expired_tscdeadline &&
+ vcpu->arch.apic->lapic_timer.timer_advance_ns)
+ __kvm_wait_lapic_expire(vcpu);
+}

void kvm_bitmap_or_dest_vcpus(struct kvm *kvm, struct kvm_lapic_irq *irq,
unsigned long *vcpu_bitmap);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index eee7edcbe7491..dfe505a7304a3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3456,9 +3456,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
clgi();
kvm_load_guest_xsave_state(vcpu);

- if (lapic_in_kernel(vcpu) &&
- vcpu->arch.apic->lapic_timer.timer_advance_ns)
- kvm_wait_lapic_expire(vcpu);
+ kvm_wait_lapic_expire(vcpu);

/*
* If this vCPU has touched SPEC_CTRL, restore the guest's value if


> }
> apic->lapic_timer.timer_mode = timer_mode;
> limit_periodic_timer_frequency(apic);
> }
> + if (timer_mode != APIC_LVT_TIMER_TSCDEADLINE &&
> + lapic_timer_advance_dynamic)

Bad indentation.

> + apic->lapic_timer.timer_advance_ns = 0;
> }
>
> /*
> --
> 2.7.4
>

2020-09-03 10:58:44

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: LAPIC: Reset timer_advance_ns if timer mode switch

On Thu, 3 Sep 2020 at 05:23, Sean Christopherson
<[email protected]> wrote:
>
> On Fri, Aug 28, 2020 at 09:35:08AM +0800, Wanpeng Li wrote:
> > From: Wanpeng Li <[email protected]>
> >
> > per-vCPU timer_advance_ns should be set to 0 if timer mode is not tscdeadline
> > otherwise we waste cpu cycles in the function lapic_timer_int_injected(),
> > especially on AMD platform which doesn't support tscdeadline mode. We can
> > reset timer_advance_ns to the initial value if switch back to tscdealine
> > timer mode.
> >
> > Signed-off-by: Wanpeng Li <[email protected]>
> > ---
> > arch/x86/kvm/lapic.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 654649b..abc296d 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -1499,10 +1499,16 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
> > kvm_lapic_set_reg(apic, APIC_TMICT, 0);
> > apic->lapic_timer.period = 0;
> > apic->lapic_timer.tscdeadline = 0;
> > + if (timer_mode == APIC_LVT_TIMER_TSCDEADLINE &&
> > + lapic_timer_advance_dynamic)
>
> Bad indentation.
>
> > + apic->lapic_timer.timer_advance_ns = LAPIC_TIMER_ADVANCE_NS_INIT;
>
> Redoing the tuning seems odd. Doubt it will matter, but it feels weird to
> have to retune the advancement just because the guest toggled between modes.
>
> Rather than clear timer_advance_ns, can we simply move the check against
> apic->lapic_timer.expired_tscdeadline much earlier? I think that would
> solve this performance hiccup, and IMO would be a logical change in any
> case. E.g. with some refactoring to avoid more duplication between VMX and
> SVM

How about something like below:

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3b32d3b..51ed4f0 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1582,9 +1582,6 @@ static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
struct kvm_lapic *apic = vcpu->arch.apic;
u64 guest_tsc, tsc_deadline;

- if (apic->lapic_timer.expired_tscdeadline == 0)
- return;
-
tsc_deadline = apic->lapic_timer.expired_tscdeadline;
apic->lapic_timer.expired_tscdeadline = 0;
guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
@@ -1599,7 +1596,10 @@ static void __kvm_wait_lapic_expire(struct
kvm_vcpu *vcpu)

void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
{
- if (lapic_timer_int_injected(vcpu))
+ if (lapic_in_kernel(vcpu) &&
+ vcpu->arch.apic->lapic_timer.expired_tscdeadline &&
+ vcpu->arch.apic->lapic_timer.timer_advance_ns &&
+ lapic_timer_int_injected(vcpu))
__kvm_wait_lapic_expire(vcpu);
}
EXPORT_SYMBOL_GPL(kvm_wait_lapic_expire);
@@ -1635,8 +1635,7 @@ static void apic_timer_expired(struct kvm_lapic
*apic, bool from_timer_fn)
}

if (kvm_use_posted_timer_interrupt(apic->vcpu)) {
- if (apic->lapic_timer.timer_advance_ns)
- __kvm_wait_lapic_expire(vcpu);
+ kvm_wait_lapic_expire(vcpu);
kvm_apic_inject_pending_timer_irqs(apic);
return;
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 0194336..19e622a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3456,9 +3456,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct
kvm_vcpu *vcpu)
clgi();
kvm_load_guest_xsave_state(vcpu);

- if (lapic_in_kernel(vcpu) &&
- vcpu->arch.apic->lapic_timer.timer_advance_ns)
- kvm_wait_lapic_expire(vcpu);
+ kvm_wait_lapic_expire(vcpu);

/*
* If this vCPU has touched SPEC_CTRL, restore the guest's value if
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a544351..d6e1656 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6800,9 +6800,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
if (enable_preemption_timer)
vmx_update_hv_timer(vcpu);

- if (lapic_in_kernel(vcpu) &&
- vcpu->arch.apic->lapic_timer.timer_advance_ns)
- kvm_wait_lapic_expire(vcpu);
+ kvm_wait_lapic_expire(vcpu);

/*
* If this vCPU has touched SPEC_CTRL, restore the guest's value if

2020-09-04 16:07:46

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: LAPIC: Reset timer_advance_ns if timer mode switch

On Thu, Sep 03, 2020 at 06:57:00PM +0800, Wanpeng Li wrote:
> On Thu, 3 Sep 2020 at 05:23, Sean Christopherson
> <[email protected]> wrote:
> >
> > On Fri, Aug 28, 2020 at 09:35:08AM +0800, Wanpeng Li wrote:
> > > From: Wanpeng Li <[email protected]>
> > >
> > > per-vCPU timer_advance_ns should be set to 0 if timer mode is not tscdeadline
> > > otherwise we waste cpu cycles in the function lapic_timer_int_injected(),
> > > especially on AMD platform which doesn't support tscdeadline mode. We can
> > > reset timer_advance_ns to the initial value if switch back to tscdealine
> > > timer mode.
> > >
> > > Signed-off-by: Wanpeng Li <[email protected]>
> > > ---
> > > arch/x86/kvm/lapic.c | 6 ++++++
> > > 1 file changed, 6 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > index 654649b..abc296d 100644
> > > --- a/arch/x86/kvm/lapic.c
> > > +++ b/arch/x86/kvm/lapic.c
> > > @@ -1499,10 +1499,16 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
> > > kvm_lapic_set_reg(apic, APIC_TMICT, 0);
> > > apic->lapic_timer.period = 0;
> > > apic->lapic_timer.tscdeadline = 0;
> > > + if (timer_mode == APIC_LVT_TIMER_TSCDEADLINE &&
> > > + lapic_timer_advance_dynamic)
> >
> > Bad indentation.
> >
> > > + apic->lapic_timer.timer_advance_ns = LAPIC_TIMER_ADVANCE_NS_INIT;
> >
> > Redoing the tuning seems odd. Doubt it will matter, but it feels weird to
> > have to retune the advancement just because the guest toggled between modes.
> >
> > Rather than clear timer_advance_ns, can we simply move the check against
> > apic->lapic_timer.expired_tscdeadline much earlier? I think that would
> > solve this performance hiccup, and IMO would be a logical change in any
> > case. E.g. with some refactoring to avoid more duplication between VMX and
> > SVM
>
> How about something like below:

That works too. The only reason I used the inline shenanigans was to avoid
the CALL+RET in VM-Enter when the timer hasn't expired.

> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 3b32d3b..51ed4f0 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1582,9 +1582,6 @@ static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
> struct kvm_lapic *apic = vcpu->arch.apic;
> u64 guest_tsc, tsc_deadline;
>
> - if (apic->lapic_timer.expired_tscdeadline == 0)
> - return;
> -
> tsc_deadline = apic->lapic_timer.expired_tscdeadline;
> apic->lapic_timer.expired_tscdeadline = 0;
> guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> @@ -1599,7 +1596,10 @@ static void __kvm_wait_lapic_expire(struct
> kvm_vcpu *vcpu)
>
> void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
> {
> - if (lapic_timer_int_injected(vcpu))
> + if (lapic_in_kernel(vcpu) &&
> + vcpu->arch.apic->lapic_timer.expired_tscdeadline &&
> + vcpu->arch.apic->lapic_timer.timer_advance_ns &&
> + lapic_timer_int_injected(vcpu))
> __kvm_wait_lapic_expire(vcpu);
> }
> EXPORT_SYMBOL_GPL(kvm_wait_lapic_expire);
> @@ -1635,8 +1635,7 @@ static void apic_timer_expired(struct kvm_lapic
> *apic, bool from_timer_fn)
> }
>
> if (kvm_use_posted_timer_interrupt(apic->vcpu)) {
> - if (apic->lapic_timer.timer_advance_ns)
> - __kvm_wait_lapic_expire(vcpu);
> + kvm_wait_lapic_expire(vcpu);
> kvm_apic_inject_pending_timer_irqs(apic);
> return;
> }
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 0194336..19e622a 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3456,9 +3456,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct
> kvm_vcpu *vcpu)
> clgi();
> kvm_load_guest_xsave_state(vcpu);
>
> - if (lapic_in_kernel(vcpu) &&
> - vcpu->arch.apic->lapic_timer.timer_advance_ns)
> - kvm_wait_lapic_expire(vcpu);
> + kvm_wait_lapic_expire(vcpu);
>
> /*
> * If this vCPU has touched SPEC_CTRL, restore the guest's value if
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a544351..d6e1656 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6800,9 +6800,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> if (enable_preemption_timer)
> vmx_update_hv_timer(vcpu);
>
> - if (lapic_in_kernel(vcpu) &&
> - vcpu->arch.apic->lapic_timer.timer_advance_ns)
> - kvm_wait_lapic_expire(vcpu);
> + kvm_wait_lapic_expire(vcpu);
>
> /*
> * If this vCPU has touched SPEC_CTRL, restore the guest's value if