2019-08-09 05:46:46

by Wanpeng Li

Subject: [PATCH] KVM: LAPIC: Periodically revaluate appropriate lapic_timer_advance_ns

From: Wanpeng Li <[email protected]>

Even on realtime CPUs, cache line bouncing, frequency scaling, the presence
of higher-priority RT tasks, etc. can change the timer response. Take these
interferences into account by periodically re-evaluating whether the
lapic_timer_advance_ns value is still the best one: do nothing if it is,
otherwise recalculate it.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/lapic.c | 16 +++++++++++++++-
arch/x86/kvm/lapic.h | 1 +
2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index df5cd07..8b62008 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -69,6 +69,7 @@
 #define LAPIC_TIMER_ADVANCE_ADJUST_INIT 1000
 /* step-by-step approximation to mitigate fluctuation */
 #define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
+#define LAPIC_TIMER_ADVANCE_RECALC_PERIOD (600 * HZ)

 static inline int apic_test_vector(int vec, void *bitmap)
 {
@@ -1484,6 +1485,17 @@ static inline void adjust_lapic_timer_advance(struct kvm_vcpu *vcpu,
 	u32 timer_advance_ns = apic->lapic_timer.timer_advance_ns;
 	u64 ns;

+	/* periodically re-evaluate */
+	if (unlikely(apic->lapic_timer.timer_advance_adjust_done)) {
+		apic->lapic_timer.recalc_timer_advance_ns = jiffies +
+			LAPIC_TIMER_ADVANCE_RECALC_PERIOD;
+		if (abs(advance_expire_delta) > LAPIC_TIMER_ADVANCE_ADJUST_DONE) {
+			timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
+			apic->lapic_timer.timer_advance_adjust_done = false;
+		} else
+			return;
+	}
+
 	/* too early */
 	if (advance_expire_delta < 0) {
 		ns = -advance_expire_delta * 1000000ULL;
@@ -1523,7 +1535,8 @@ static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
 	if (guest_tsc < tsc_deadline)
 		__wait_lapic_expire(vcpu, tsc_deadline - guest_tsc);

-	if (unlikely(!apic->lapic_timer.timer_advance_adjust_done))
+	if (unlikely(!apic->lapic_timer.timer_advance_adjust_done) ||
+	    time_before(apic->lapic_timer.recalc_timer_advance_ns, jiffies))
 		adjust_lapic_timer_advance(vcpu, apic->lapic_timer.advance_expire_delta);
 }

@@ -2301,6 +2314,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
 	if (timer_advance_ns == -1) {
 		apic->lapic_timer.timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
 		apic->lapic_timer.timer_advance_adjust_done = false;
+		apic->lapic_timer.recalc_timer_advance_ns = jiffies;
 	} else {
 		apic->lapic_timer.timer_advance_ns = timer_advance_ns;
 		apic->lapic_timer.timer_advance_adjust_done = true;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 50053d2..31ced36 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -31,6 +31,7 @@ struct kvm_timer {
 	u32 timer_mode_mask;
 	u64 tscdeadline;
 	u64 expired_tscdeadline;
+	unsigned long recalc_timer_advance_ns;
 	u32 timer_advance_ns;
 	s64 advance_expire_delta;
 	atomic_t pending; /* accumulated triggered timers */
--
2.7.4
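
For readers following the thread, the gating logic of the patch can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not the kernel code: struct advance_state, tick_before(), should_adjust() and reevaluate() are hypothetical names, HZ_ASSUMED is an arbitrary tick rate, and the value 100 for the "done" threshold is assumed because the thread does not show it. The sketch also leaves out the step-by-step adjustment that the real function continues with after a restart.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define HZ_ASSUMED    250UL               /* assumed tick rate (hypothetical) */
#define RECALC_PERIOD (600 * HZ_ASSUMED)  /* 600 seconds of ticks, as in the patch */
#define ADJUST_DONE   100                 /* assumed threshold, in TSC cycles */
#define ADJUST_INIT   1000U               /* initial advance in ns, as in the patch */

/* Hypothetical stand-in for the relevant fields of struct kvm_timer. */
struct advance_state {
	bool adjust_done;
	unsigned long recalc_at;   /* tick count after which to re-check */
	uint32_t advance_ns;
};

/* Wraparound-safe "a is before b", the same idea as the kernel's time_before(). */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Should the adjustment routine run on this timer expiry? */
static bool should_adjust(const struct advance_state *s, unsigned long now)
{
	return !s->adjust_done || tick_before(s->recalc_at, now);
}

/*
 * Periodic check of an already tuned value. Returns true when tuning has
 * to restart from the initial value, false when the current value is kept.
 */
static bool reevaluate(struct advance_state *s, unsigned long now, int64_t delta)
{
	s->recalc_at = now + RECALC_PERIOD;
	if (llabs(delta) > ADJUST_DONE) {
		s->advance_ns = ADJUST_INIT;
		s->adjust_done = false;
		return true;
	}
	return false;
}
```

In this model a tuned value is only looked at again once the tick counter has passed recalc_at, and it is thrown away only if the observed delta is larger than the threshold.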


2019-08-09 10:26:56

by Paolo Bonzini

Subject: Re: [PATCH] KVM: LAPIC: Periodically revaluate appropriate lapic_timer_advance_ns

On 09/08/19 07:45, Wanpeng Li wrote:
> From: Wanpeng Li <[email protected]>
>
> Even on realtime CPUs, cache line bouncing, frequency scaling, the presence
> of higher-priority RT tasks, etc. can change the timer response. Take these
> interferences into account by periodically re-evaluating whether the
> lapic_timer_advance_ns value is still the best one: do nothing if it is,
> otherwise recalculate it.

How much fluctuation do you observe between different runs?

Paolo

2019-08-12 09:08:36

by Wanpeng Li

Subject: Re: [PATCH] KVM: LAPIC: Periodically revaluate appropriate lapic_timer_advance_ns

On Fri, 9 Aug 2019 at 18:24, Paolo Bonzini <[email protected]> wrote:
>
> On 09/08/19 07:45, Wanpeng Li wrote:
> > From: Wanpeng Li <[email protected]>
> >
> > Even on realtime CPUs, cache line bouncing, frequency scaling, the presence
> > of higher-priority RT tasks, etc. can change the timer response. Take these
> > interferences into account by periodically re-evaluating whether the
> > lapic_timer_advance_ns value is still the best one: do nothing if it is,
> > otherwise recalculate it.
>
> How much fluctuation do you observe between different runs?

It can sometimes be ~1000 cycles after converting to the guest TSC frequency.

Regards,
Wanpeng Li
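
To put a figure like ~1000 cycles in perspective, a cycle count can be converted to nanoseconds once the TSC frequency is known. The helper below is a minimal sketch, assuming the frequency is expressed in kHz; cycles_to_ns() is a hypothetical name, and the 2.5 GHz figure used afterwards is an illustrative assumption, not a number from this thread.

```c
#include <stdint.h>

/* Convert a number of TSC cycles to nanoseconds for a TSC running at tsc_khz. */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t tsc_khz)
{
	/* cycles / (tsc_khz * 1000 Hz) seconds, scaled to ns */
	return cycles * 1000000ULL / tsc_khz;
}
```

With an assumed guest TSC of 2.5 GHz (2500000 kHz), cycles_to_ns(1000, 2500000) evaluates to 400, so a 1000-cycle fluctuation would correspond to roughly 400 ns on such a guest.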

2019-08-14 12:51:04

by Paolo Bonzini

Subject: Re: [PATCH] KVM: LAPIC: Periodically revaluate appropriate lapic_timer_advance_ns

On 12/08/19 11:06, Wanpeng Li wrote:
> On Fri, 9 Aug 2019 at 18:24, Paolo Bonzini <[email protected]> wrote:
>>
>> On 09/08/19 07:45, Wanpeng Li wrote:
>>> From: Wanpeng Li <[email protected]>
>>>
>>> Even on realtime CPUs, cache line bouncing, frequency scaling, the presence
>>> of higher-priority RT tasks, etc. can change the timer response. Take these
>>> interferences into account by periodically re-evaluating whether the
>>> lapic_timer_advance_ns value is still the best one: do nothing if it is,
>>> otherwise recalculate it.
>>
>> How much fluctuation do you observe between different runs?
>
> It can sometimes be ~1000 cycles after converting to the guest TSC frequency.

Hmm, I wonder if we need some kind of continuous smoothing. Something like

	if (abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_DONE) {
		/* no update for random fluctuations */
		return;
	}

	if (unlikely(timer_advance_ns > 5000))
		timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
	apic->lapic_timer.timer_advance_ns = timer_advance_ns;

and removing all the timer_advance_adjust_done stuff. What do you think?

Paolo
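
One possible reading of the suggestion above, written as a self-contained userspace function, is sketched below. It is an interpretation and not code from the thread: adjust_advance() and ADVANCE_MAX are hypothetical names, the value 100 for the threshold is assumed, and checking the threshold before computing the step is only one of the possible orderings. There is no "done" flag here; every expiry may nudge the value.

```c
#include <stdint.h>
#include <stdlib.h>

#define ADJUST_DONE 100     /* assumed threshold in TSC cycles; not given in the thread */
#define ADJUST_INIT 1000U   /* initial advance in ns */
#define ADJUST_STEP 8U      /* limits one step to 1/8 of the current value */
#define ADVANCE_MAX 5000U   /* upper bound in ns, taken from the snippet above */

/*
 * delta_cycles < 0 means the timer fired too early, > 0 too late.
 * Returns the new advance value in ns.
 */
static uint32_t adjust_advance(uint32_t advance_ns, int64_t delta_cycles,
			       uint32_t tsc_khz)
{
	uint64_t ns;
	uint32_t cap, step;

	/* no update for random fluctuations */
	if (llabs(delta_cycles) < ADJUST_DONE)
		return advance_ns;

	/* convert the error to ns and limit the size of a single step */
	ns = (uint64_t)llabs(delta_cycles) * 1000000ULL / tsc_khz;
	cap = advance_ns / ADJUST_STEP;
	step = ns > cap ? cap : (uint32_t)ns;

	if (delta_cycles < 0)
		advance_ns -= step;
	else
		advance_ns += step;

	/* fall back to the initial value if the result runs away */
	if (advance_ns > ADVANCE_MAX)
		advance_ns = ADJUST_INIT;
	return advance_ns;
}
```

Because a single step is capped at one eighth of the current value, one outlier cannot move the advance far, while a persistent error keeps pulling it in the same direction.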

2019-08-15 04:07:47

by Wanpeng Li

Subject: Re: [PATCH] KVM: LAPIC: Periodically revaluate appropriate lapic_timer_advance_ns

On Wed, 14 Aug 2019 at 20:50, Paolo Bonzini <[email protected]> wrote:
>
> On 12/08/19 11:06, Wanpeng Li wrote:
> > On Fri, 9 Aug 2019 at 18:24, Paolo Bonzini <[email protected]> wrote:
> >>
> >> On 09/08/19 07:45, Wanpeng Li wrote:
> >>> From: Wanpeng Li <[email protected]>
> >>>
> >>> Even on realtime CPUs, cache line bouncing, frequency scaling, the presence
> >>> of higher-priority RT tasks, etc. can change the timer response. Take these
> >>> interferences into account by periodically re-evaluating whether the
> >>> lapic_timer_advance_ns value is still the best one: do nothing if it is,
> >>> otherwise recalculate it.
> >>
> >> How much fluctuation do you observe between different runs?
> >
> > > It can sometimes be ~1000 cycles after converting to the guest TSC frequency.
>
> Hmm, I wonder if we need some kind of continuous smoothing. Something like

Actually, in testing (running a Linux guest rather than kvm-unit-tests)
the delta can fluctuate drastically rather than change smoothly.

>
> 	if (abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_DONE) {
> 		/* no update for random fluctuations */
> 		return;
> 	}
>
> 	if (unlikely(timer_advance_ns > 5000))
> 		timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
> 	apic->lapic_timer.timer_advance_ns = timer_advance_ns;
>
> and removing all the timer_advance_adjust_done stuff. What do you think?

I just sent out v2, which periodically re-evaluates and takes the minimal,
conservative value from these re-evaluation points. Please have a look. :)

Regards,
Wanpeng Li