2016-10-24 10:23:55

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v3 0/5] KVM: LAPIC: Add APIC Timer periodic/oneshot mode VMX preemption timer support

Most windows guests which I have on hand currently still utilize APIC Timer
periodic/oneshot mode instead of APIC Timer tsc-deadline mode:
- windows 2008 server r2
- windows 2012 server r2
- windows 7
- windows 10

This patchset adds the APIC Timer periodic/oneshot mode VMX preemption
timer support.

I test the patchset by adding kvm-unit-test/apic_timer_latency.flat to test
100 thousands times APIC timer operations.

The patchset reduces ~5600+ cycles for each APIC timer periodic mode operation
virtualization.

v3 -> v2:
* fix windows server verison guest boot hang (by adding period check in
start_sw_period and kvm_lapic_expired_hv_timer)

v2 -> v1:
* replace ktime_compare by ktime_after
* introduce advance_periodic_target_expiration() and use it in kvm_lapic_expired_hv_timer()
and in apic_timer_fn()
* don't zero apic->lapic_timer.target_expiration of apic_lvtt_period() when
!kvm_lapic_hv_timer_in_use

v1 -> RFC v3:
* squash patch 3 into patch 6
* fix switch sw to hv timer

RFC v2 -> RFC v3:
* remove kvm_lapic_hv_timer_in_use() check in apic_get_tmcc, replace
the hritmer_get_remaining() by target_expiration - now
* rename expired_period to target_expiration
* introduce set_target_expiration() helper
* move the checking of minimal period to set_target_expiration()
* cleanup kvm_get_lapic_target_deadline_tsc()

RFC v1 -> RFC v2:
* remember the timeout when setting up the timer to get a correct TMCCT
* move apic->lapic_timer.period/tscdeadline caculations to start_apic_timer()


Wanpeng Li (5):
KVM: LAPIC: extract start_sw_period() to handle periodic/oneshot mode
KVM: LAPIC: guarantee the timer is in tsc-deadline mode
KVM: LAPIC: introduce kvm_get_lapic_target_expiration_tsc()
KVM: LAPIC: rename start/cancel_hv_tscdeadline to start/cancel_hv_timer
KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support

arch/x86/kvm/lapic.c | 207 ++++++++++++++++++++++++++++++++++-----------------
arch/x86/kvm/lapic.h | 2 +
arch/x86/kvm/x86.c | 2 +-
3 files changed, 143 insertions(+), 68 deletions(-)

--
1.9.1


2016-10-24 10:23:59

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v3 1/5] KVM: LAPIC: extract start_sw_period() to handle periodic/oneshot mode

From: Wanpeng Li <[email protected]>

Extract start_sw_period() to handle periodic/oneshot mode, it will be
used by later patch.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Yunhong Jiang <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/lapic.c | 89 +++++++++++++++++++++++++++-------------------------
1 file changed, 47 insertions(+), 42 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 23b99f3..dad743e 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1347,6 +1347,50 @@ static void start_sw_tscdeadline(struct kvm_lapic *apic)
local_irq_restore(flags);
}

+static void start_sw_period(struct kvm_lapic *apic)
+{
+ ktime_t now;
+
+ /* lapic timer in oneshot or periodic mode */
+ now = apic->lapic_timer.timer.base->get_time();
+ apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
+ * APIC_BUS_CYCLE_NS * apic->divide_count;
+
+ if (!apic->lapic_timer.period)
+ return;
+ /*
+ * Do not allow the guest to program periodic timers with small
+ * interval, since the hrtimers are not throttled by the host
+ * scheduler.
+ */
+ if (apic_lvtt_period(apic)) {
+ s64 min_period = min_timer_period_us * 1000LL;
+
+ if (apic->lapic_timer.period < min_period) {
+ pr_info_ratelimited(
+ "kvm: vcpu %i: requested %lld ns "
+ "lapic timer period limited to %lld ns\n",
+ apic->vcpu->vcpu_id,
+ apic->lapic_timer.period, min_period);
+ apic->lapic_timer.period = min_period;
+ }
+ }
+
+ hrtimer_start(&apic->lapic_timer.timer,
+ ktime_add_ns(now, apic->lapic_timer.period),
+ HRTIMER_MODE_ABS_PINNED);
+
+ apic_debug("%s: bus cycle is %" PRId64 "ns, now 0x%016"
+ PRIx64 ", "
+ "timer initial count 0x%x, period %lldns, "
+ "expire @ 0x%016" PRIx64 ".\n", __func__,
+ APIC_BUS_CYCLE_NS, ktime_to_ns(now),
+ kvm_lapic_get_reg(apic, APIC_TMICT),
+ apic->lapic_timer.period,
+ ktime_to_ns(ktime_add_ns(now,
+ apic->lapic_timer.period)));
+}
+
bool kvm_lapic_hv_timer_in_use(struct kvm_vcpu *vcpu)
{
if (!lapic_in_kernel(vcpu))
@@ -1424,50 +1468,11 @@ void kvm_lapic_switch_to_sw_timer(struct kvm_vcpu *vcpu)

static void start_apic_timer(struct kvm_lapic *apic)
{
- ktime_t now;
-
atomic_set(&apic->lapic_timer.pending, 0);

- if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic)) {
- /* lapic timer in oneshot or periodic mode */
- now = apic->lapic_timer.timer.base->get_time();
- apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
- * APIC_BUS_CYCLE_NS * apic->divide_count;
-
- if (!apic->lapic_timer.period)
- return;
- /*
- * Do not allow the guest to program periodic timers with small
- * interval, since the hrtimers are not throttled by the host
- * scheduler.
- */
- if (apic_lvtt_period(apic)) {
- s64 min_period = min_timer_period_us * 1000LL;
-
- if (apic->lapic_timer.period < min_period) {
- pr_info_ratelimited(
- "kvm: vcpu %i: requested %lld ns "
- "lapic timer period limited to %lld ns\n",
- apic->vcpu->vcpu_id,
- apic->lapic_timer.period, min_period);
- apic->lapic_timer.period = min_period;
- }
- }
-
- hrtimer_start(&apic->lapic_timer.timer,
- ktime_add_ns(now, apic->lapic_timer.period),
- HRTIMER_MODE_ABS_PINNED);
-
- apic_debug("%s: bus cycle is %" PRId64 "ns, now 0x%016"
- PRIx64 ", "
- "timer initial count 0x%x, period %lldns, "
- "expire @ 0x%016" PRIx64 ".\n", __func__,
- APIC_BUS_CYCLE_NS, ktime_to_ns(now),
- kvm_lapic_get_reg(apic, APIC_TMICT),
- apic->lapic_timer.period,
- ktime_to_ns(ktime_add_ns(now,
- apic->lapic_timer.period)));
- } else if (apic_lvtt_tscdeadline(apic)) {
+ if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic))
+ start_sw_period(apic);
+ else if (apic_lvtt_tscdeadline(apic)) {
if (!(kvm_x86_ops->set_hv_timer && start_hv_tscdeadline(apic)))
start_sw_tscdeadline(apic);
}
--
1.9.1

2016-10-24 10:24:05

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v3 3/5] KVM: LAPIC: introduce kvm_get_lapic_target_expiration_tsc()

From: Wanpeng Li <[email protected]>

Introdce kvm_get_lapic_target_expiration_tsc() to get APIC Timer target
deadline tsc.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Yunhong Jiang <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/lapic.c | 9 +++++++++
arch/x86/kvm/lapic.h | 1 +
arch/x86/kvm/x86.c | 2 +-
3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index dce6c0b..531fc28 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1706,6 +1706,15 @@ void kvm_free_lapic(struct kvm_vcpu *vcpu)
* LAPIC interface
*----------------------------------------------------------------------
*/
+u64 kvm_get_lapic_target_expiration_tsc(struct kvm_vcpu *vcpu)
+{
+ struct kvm_lapic *apic = vcpu->arch.apic;
+
+ if (!lapic_in_kernel(vcpu))
+ return 0;
+
+ return apic->lapic_timer.tscdeadline;
+}

u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu)
{
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index f60d01c..031db26 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -85,6 +85,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s);
int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu);

+u64 kvm_get_lapic_target_expiration_tsc(struct kvm_vcpu *vcpu);
u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu);
void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data);

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cd22dda..2a33d98 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2794,7 +2794,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
}
if (kvm_lapic_hv_timer_in_use(vcpu) &&
kvm_x86_ops->set_hv_timer(vcpu,
- kvm_get_lapic_tscdeadline_msr(vcpu)))
+ kvm_get_lapic_target_expiration_tsc(vcpu)))
kvm_lapic_switch_to_sw_timer(vcpu);
/*
* On a host with synchronized TSC, there is no need to update
--
1.9.1

2016-10-24 10:24:26

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v3 5/5] KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support

From: Wanpeng Li <[email protected]>

Most windows guests still utilize APIC Timer periodic/oneshot mode
instead of tsc-deadline mode, and the APIC Timer periodic/oneshot
mode are still emulated by high overhead hrtimer on host. This patch
converts the expected expire time of the periodic/oneshot mode to
guest deadline tsc in order to leverage VMX preemption timer logic
for APIC Timer tsc-deadline mode. After each preemption timer vmexit
preemption timer is restarted to emulate LVTT current-count register
is automatically reloaded from the initial-count register when the
count reaches 0. This patch reduces ~5600 cycles for each APIC Timer
periodic mode operation virtualization.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Yunhong Jiang <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/lapic.c | 117 ++++++++++++++++++++++++++++++++++++++-------------
arch/x86/kvm/lapic.h | 1 +
2 files changed, 89 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0354a79..827ef5d 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1090,7 +1090,7 @@ static void apic_send_ipi(struct kvm_lapic *apic)

static u32 apic_get_tmcct(struct kvm_lapic *apic)
{
- ktime_t remaining;
+ ktime_t remaining, now;
s64 ns;
u32 tmcct;

@@ -1101,7 +1101,8 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic)
apic->lapic_timer.period == 0)
return 0;

- remaining = hrtimer_get_remaining(&apic->lapic_timer.timer);
+ now = apic->lapic_timer.timer.base->get_time();
+ remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
if (ktime_to_ns(remaining) < 0)
remaining = ktime_set(0, 0);

@@ -1351,13 +1352,31 @@ static void start_sw_period(struct kvm_lapic *apic)
{
ktime_t now;

- /* lapic timer in oneshot or periodic mode */
now = apic->lapic_timer.timer.base->get_time();
- apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
- * APIC_BUS_CYCLE_NS * apic->divide_count;

if (!apic->lapic_timer.period)
return;
+
+ if (likely(ktime_after(apic->lapic_timer.target_expiration, now)))
+ hrtimer_start(&apic->lapic_timer.timer,
+ apic->lapic_timer.target_expiration,
+ HRTIMER_MODE_ABS_PINNED);
+ else
+ apic_timer_expired(apic);
+}
+
+static bool set_target_expiration(struct kvm_lapic *apic)
+{
+ ktime_t now;
+ u64 tscl = rdtsc();
+
+ now = apic->lapic_timer.timer.base->get_time();
+ apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
+ * APIC_BUS_CYCLE_NS * apic->divide_count;
+
+ if (!apic->lapic_timer.period)
+ return false;
+
/*
* Do not allow the guest to program periodic timers with small
* interval, since the hrtimers are not throttled by the host
@@ -1376,10 +1395,6 @@ static void start_sw_period(struct kvm_lapic *apic)
}
}

- hrtimer_start(&apic->lapic_timer.timer,
- ktime_add_ns(now, apic->lapic_timer.period),
- HRTIMER_MODE_ABS_PINNED);
-
apic_debug("%s: bus cycle is %" PRId64 "ns, now 0x%016"
PRIx64 ", "
"timer initial count 0x%x, period %lldns, "
@@ -1389,6 +1404,20 @@ static void start_sw_period(struct kvm_lapic *apic)
apic->lapic_timer.period,
ktime_to_ns(ktime_add_ns(now,
apic->lapic_timer.period)));
+
+ apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
+ nsec_to_cycles(apic->vcpu, apic->lapic_timer.period);
+ apic->lapic_timer.target_expiration = ktime_add_ns(now, apic->lapic_timer.period);
+
+ return true;
+}
+
+static void advance_periodic_target_expiration(struct kvm_lapic *apic)
+{
+ apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, rdtsc()) +
+ nsec_to_cycles(apic->vcpu, apic->lapic_timer.period);
+ apic->lapic_timer.target_expiration = ktime_add_ns(apic->lapic_timer.timer.base->get_time(),
+ apic->lapic_timer.period);
}

bool kvm_lapic_hv_timer_in_use(struct kvm_vcpu *vcpu)
@@ -1406,22 +1435,12 @@ static void cancel_hv_timer(struct kvm_lapic *apic)
apic->lapic_timer.hv_timer_in_use = false;
}

-void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu)
-{
- struct kvm_lapic *apic = vcpu->arch.apic;
-
- WARN_ON(!apic->lapic_timer.hv_timer_in_use);
- WARN_ON(swait_active(&vcpu->wq));
- cancel_hv_timer(apic);
- apic_timer_expired(apic);
-}
-EXPORT_SYMBOL_GPL(kvm_lapic_expired_hv_timer);
-
static bool start_hv_timer(struct kvm_lapic *apic)
{
u64 tscdeadline = apic->lapic_timer.tscdeadline;

- if (atomic_read(&apic->lapic_timer.pending) ||
+ if ((atomic_read(&apic->lapic_timer.pending) &&
+ !apic_lvtt_period(apic)) ||
kvm_x86_ops->set_hv_timer(apic->vcpu, tscdeadline)) {
if (apic->lapic_timer.hv_timer_in_use)
cancel_hv_timer(apic);
@@ -1430,7 +1449,8 @@ static bool start_hv_timer(struct kvm_lapic *apic)
hrtimer_cancel(&apic->lapic_timer.timer);

/* In case the sw timer triggered in the window */
- if (atomic_read(&apic->lapic_timer.pending))
+ if (atomic_read(&apic->lapic_timer.pending) &&
+ !apic_lvtt_period(apic))
cancel_hv_timer(apic);
}
trace_kvm_hv_timer_state(apic->vcpu->vcpu_id,
@@ -1438,14 +1458,43 @@ static bool start_hv_timer(struct kvm_lapic *apic)
return apic->lapic_timer.hv_timer_in_use;
}

+void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu)
+{
+ struct kvm_lapic *apic = vcpu->arch.apic;
+
+ WARN_ON(!apic->lapic_timer.hv_timer_in_use);
+ WARN_ON(swait_active(&vcpu->wq));
+ cancel_hv_timer(apic);
+ apic_timer_expired(apic);
+
+ if (apic_lvtt_period(apic) && apic->lapic_timer.period) {
+ advance_periodic_target_expiration(apic);
+ if (!start_hv_timer(apic))
+ start_sw_period(apic);
+ }
+}
+EXPORT_SYMBOL_GPL(kvm_lapic_expired_hv_timer);
+
void kvm_lapic_switch_to_hv_timer(struct kvm_vcpu *vcpu)
{
struct kvm_lapic *apic = vcpu->arch.apic;

WARN_ON(apic->lapic_timer.hv_timer_in_use);

- if (apic_lvtt_tscdeadline(apic))
- start_hv_timer(apic);
+ if (apic_lvtt_period(apic)) {
+ ktime_t remaining, now;
+ u64 tscl = rdtsc();
+
+ now = apic->lapic_timer.timer.base->get_time();
+ remaining = hrtimer_get_remaining(&apic->lapic_timer.timer);
+ if (ktime_to_ns(remaining) < 0)
+ remaining = ktime_set(0, 0);
+
+ apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
+ nsec_to_cycles(apic->vcpu, ktime_to_ns(remaining));
+ apic->lapic_timer.target_expiration = ktime_add_ns(now, ktime_to_ns(remaining));
+ }
+ start_hv_timer(apic);
}
EXPORT_SYMBOL_GPL(kvm_lapic_switch_to_hv_timer);

@@ -1462,7 +1511,10 @@ void kvm_lapic_switch_to_sw_timer(struct kvm_vcpu *vcpu)
if (atomic_read(&apic->lapic_timer.pending))
return;

- start_sw_tscdeadline(apic);
+ if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic))
+ start_sw_period(apic);
+ else if (apic_lvtt_tscdeadline(apic))
+ start_sw_tscdeadline(apic);
}
EXPORT_SYMBOL_GPL(kvm_lapic_switch_to_sw_timer);

@@ -1470,9 +1522,11 @@ static void start_apic_timer(struct kvm_lapic *apic)
{
atomic_set(&apic->lapic_timer.pending, 0);

- if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic))
- start_sw_period(apic);
- else if (apic_lvtt_tscdeadline(apic)) {
+ if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic)) {
+ if (set_target_expiration(apic) &&
+ !(kvm_x86_ops->set_hv_timer && start_hv_timer(apic)))
+ start_sw_period(apic);
+ } else if (apic_lvtt_tscdeadline(apic)) {
if (!(kvm_x86_ops->set_hv_timer && start_hv_timer(apic)))
start_sw_tscdeadline(apic);
}
@@ -1923,6 +1977,7 @@ static enum hrtimer_restart apic_timer_fn(struct hrtimer *data)
apic_timer_expired(apic);

if (lapic_is_periodic(apic)) {
+ advance_periodic_target_expiration(apic);
hrtimer_add_expires_ns(&ktimer->timer, ktimer->period);
return HRTIMER_RESTART;
} else
@@ -2005,8 +2060,12 @@ void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu)

if (atomic_read(&apic->lapic_timer.pending) > 0) {
kvm_apic_local_deliver(apic, APIC_LVTT);
- if (apic_lvtt_tscdeadline(apic))
+ if (apic_lvtt_period(apic))
+ apic->lapic_timer.tscdeadline = 0;
+ if (apic_lvtt_oneshot(apic)) {
apic->lapic_timer.tscdeadline = 0;
+ apic->lapic_timer.target_expiration = ktime_set(0, 0);
+ }
atomic_set(&apic->lapic_timer.pending, 0);
}
}
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 031db26..e0c8023 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -15,6 +15,7 @@
struct kvm_timer {
struct hrtimer timer;
s64 period; /* unit: ns */
+ ktime_t target_expiration;
u32 timer_mode;
u32 timer_mode_mask;
u64 tscdeadline;
--
1.9.1

2016-10-24 10:24:45

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v3 4/5] KVM: LAPIC: rename start/cancel_hv_tscdeadline to start/cancel_hv_timer

From: Wanpeng Li <[email protected]>

Rename start/cancel_hv_tscdeadline to start/cancel_hv_timer since
they will handle both APIC Timer periodic/oneshot mode and tsc-deadline
mode.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Yunhong Jiang <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/lapic.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 531fc28..0354a79 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1400,7 +1400,7 @@ bool kvm_lapic_hv_timer_in_use(struct kvm_vcpu *vcpu)
}
EXPORT_SYMBOL_GPL(kvm_lapic_hv_timer_in_use);

-static void cancel_hv_tscdeadline(struct kvm_lapic *apic)
+static void cancel_hv_timer(struct kvm_lapic *apic)
{
kvm_x86_ops->cancel_hv_timer(apic->vcpu);
apic->lapic_timer.hv_timer_in_use = false;
@@ -1412,26 +1412,26 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu)

WARN_ON(!apic->lapic_timer.hv_timer_in_use);
WARN_ON(swait_active(&vcpu->wq));
- cancel_hv_tscdeadline(apic);
+ cancel_hv_timer(apic);
apic_timer_expired(apic);
}
EXPORT_SYMBOL_GPL(kvm_lapic_expired_hv_timer);

-static bool start_hv_tscdeadline(struct kvm_lapic *apic)
+static bool start_hv_timer(struct kvm_lapic *apic)
{
u64 tscdeadline = apic->lapic_timer.tscdeadline;

if (atomic_read(&apic->lapic_timer.pending) ||
kvm_x86_ops->set_hv_timer(apic->vcpu, tscdeadline)) {
if (apic->lapic_timer.hv_timer_in_use)
- cancel_hv_tscdeadline(apic);
+ cancel_hv_timer(apic);
} else {
apic->lapic_timer.hv_timer_in_use = true;
hrtimer_cancel(&apic->lapic_timer.timer);

/* In case the sw timer triggered in the window */
if (atomic_read(&apic->lapic_timer.pending))
- cancel_hv_tscdeadline(apic);
+ cancel_hv_timer(apic);
}
trace_kvm_hv_timer_state(apic->vcpu->vcpu_id,
apic->lapic_timer.hv_timer_in_use);
@@ -1445,7 +1445,7 @@ void kvm_lapic_switch_to_hv_timer(struct kvm_vcpu *vcpu)
WARN_ON(apic->lapic_timer.hv_timer_in_use);

if (apic_lvtt_tscdeadline(apic))
- start_hv_tscdeadline(apic);
+ start_hv_timer(apic);
}
EXPORT_SYMBOL_GPL(kvm_lapic_switch_to_hv_timer);

@@ -1457,7 +1457,7 @@ void kvm_lapic_switch_to_sw_timer(struct kvm_vcpu *vcpu)
if (!apic->lapic_timer.hv_timer_in_use)
return;

- cancel_hv_tscdeadline(apic);
+ cancel_hv_timer(apic);

if (atomic_read(&apic->lapic_timer.pending))
return;
@@ -1473,7 +1473,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic))
start_sw_period(apic);
else if (apic_lvtt_tscdeadline(apic)) {
- if (!(kvm_x86_ops->set_hv_timer && start_hv_tscdeadline(apic)))
+ if (!(kvm_x86_ops->set_hv_timer && start_hv_timer(apic)))
start_sw_tscdeadline(apic);
}
}
--
1.9.1

2016-10-24 10:25:27

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v3 2/5] KVM: LAPIC: guarantee the timer is in tsc-deadline mode

From: Wanpeng Li <[email protected]>

Check apic_lvtt_tscdeadline() mode directly instead of apic_lvtt_oneshot()
and apic_lvtt_period() to guarantee the timer is in tsc-deadline mode when
rdmsr MSR_IA32_TSCDEADLINE.

Suggested-by: Radim Krčmář <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Yunhong Jiang <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/lapic.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index dad743e..dce6c0b 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1711,8 +1711,8 @@ u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu)
{
struct kvm_lapic *apic = vcpu->arch.apic;

- if (!lapic_in_kernel(vcpu) || apic_lvtt_oneshot(apic) ||
- apic_lvtt_period(apic))
+ if (!lapic_in_kernel(vcpu) ||
+ !apic_lvtt_tscdeadline(apic))
return 0;

return apic->lapic_timer.tscdeadline;
--
1.9.1

2016-10-24 14:50:24

by Radim Krčmář

[permalink] [raw]
Subject: Re: [PATCH v3 5/5] KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support

2016-10-24 18:23+0800, Wanpeng Li:
> From: Wanpeng Li <[email protected]>
>
> Most windows guests still utilize APIC Timer periodic/oneshot mode
> instead of tsc-deadline mode, and the APIC Timer periodic/oneshot
> mode are still emulated by high overhead hrtimer on host. This patch
> converts the expected expire time of the periodic/oneshot mode to
> guest deadline tsc in order to leverage VMX preemption timer logic
> for APIC Timer tsc-deadline mode. After each preemption timer vmexit
> preemption timer is restarted to emulate LVTT current-count register
> is automatically reloaded from the initial-count register when the
> count reaches 0. This patch reduces ~5600 cycles for each APIC Timer
> periodic mode operation virtualization.
>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Radim Krčmář <[email protected]>
> Cc: Yunhong Jiang <[email protected]>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> arch/x86/kvm/lapic.c | 117 ++++++++++++++++++++++++++++++++++++++-------------
> arch/x86/kvm/lapic.h | 1 +
> 2 files changed, 89 insertions(+), 29 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> @@ -2005,8 +2060,12 @@ void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu)
>
> if (atomic_read(&apic->lapic_timer.pending) > 0) {
> kvm_apic_local_deliver(apic, APIC_LVTT);
> - if (apic_lvtt_tscdeadline(apic))
> + if (apic_lvtt_period(apic))

This should remain apic_lvtt_tscdeadline(). I can change that when
applying to kvm/queue.

> + apic->lapic_timer.tscdeadline = 0;
> + if (apic_lvtt_oneshot(apic)) {
> apic->lapic_timer.tscdeadline = 0;
> + apic->lapic_timer.target_expiration = ktime_set(0, 0);
> + }
> atomic_set(&apic->lapic_timer.pending, 0);
> }
> }

2016-10-24 15:03:28

by Radim Krčmář

[permalink] [raw]
Subject: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

I have only compile-tested it, but it should optimize the switch and
also fix two bugs. The first one is major.
This needs that the deadline clearing in [5/5] is fixed.
---8<---
We must start the hrimer even if the expiration is already in the past,
otherwise the periodic timer would not rearm the hrtimer.

And computing next expiration of a period timer does not require current
time. The period should be constant, so it is more precise to add the
period to the last expiration time. This fixes a time difference
between hrtimer expiration and apic->lapic_timer.target_expiration.

Signed-off-by: Radim Krčmář <[email protected]>
---
arch/x86/kvm/lapic.c | 38 +++++++++++++-------------------------
1 file changed, 13 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index f4734fe12803..6244988418be 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1350,19 +1350,19 @@ static void start_sw_tscdeadline(struct kvm_lapic *apic)

static void start_sw_period(struct kvm_lapic *apic)
{
- ktime_t now;
-
- now = apic->lapic_timer.timer.base->get_time();
-
if (!apic->lapic_timer.period)
return;

- if (likely(ktime_after(apic->lapic_timer.target_expiration, now)))
- hrtimer_start(&apic->lapic_timer.timer,
- apic->lapic_timer.target_expiration,
- HRTIMER_MODE_ABS_PINNED);
- else
+ if (apic_lvtt_oneshot(apic) &&
+ ktime_after(apic->lapic_timer.target_expiration,
+ apic->lapic_timer.timer.base->get_time())) {
apic_timer_expired(apic);
+ return;
+ }
+
+ hrtimer_start(&apic->lapic_timer.timer,
+ apic->lapic_timer.target_expiration,
+ HRTIMER_MODE_ABS_PINNED);
}

static bool set_target_expiration(struct kvm_lapic *apic)
@@ -1414,10 +1414,11 @@ static bool set_target_expiration(struct kvm_lapic *apic)

static void advance_periodic_target_expiration(struct kvm_lapic *apic)
{
- apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, rdtsc()) +
+ apic->lapic_timer.tscdeadline +=
nsec_to_cycles(apic->vcpu, apic->lapic_timer.period);
- apic->lapic_timer.target_expiration = ktime_add_ns(apic->lapic_timer.timer.base->get_time(),
- apic->lapic_timer.period);
+ apic->lapic_timer.target_expiration =
+ ktime_add_ns(apic->lapic_timer.target_expiration,
+ apic->lapic_timer.period);
}

bool kvm_lapic_hv_timer_in_use(struct kvm_vcpu *vcpu)
@@ -1481,19 +1482,6 @@ void kvm_lapic_switch_to_hv_timer(struct kvm_vcpu *vcpu)

WARN_ON(apic->lapic_timer.hv_timer_in_use);

- if (apic_lvtt_period(apic)) {
- ktime_t remaining, now;
- u64 tscl = rdtsc();
-
- now = apic->lapic_timer.timer.base->get_time();
- remaining = hrtimer_get_remaining(&apic->lapic_timer.timer);
- if (ktime_to_ns(remaining) < 0)
- remaining = ktime_set(0, 0);
-
- apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
- nsec_to_cycles(apic->vcpu, ktime_to_ns(remaining));
- apic->lapic_timer.target_expiration = ktime_add_ns(now, ktime_to_ns(remaining));
- }
start_hv_timer(apic);
}
EXPORT_SYMBOL_GPL(kvm_lapic_switch_to_hv_timer);
--
2.10.1

2016-10-24 15:09:10

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers



On 24/10/2016 17:03, Radim Krčmář wrote:
> I have only compile-tested it, but it should optimize the switch and
> also fix two bugs. The first one is major.
> This needs that the deadline clearing in [5/5] is fixed.
> ---8<---
> We must start the hrimer even if the expiration is already in the past,
> otherwise the periodic timer would not rearm the hrtimer.
>
> And computing next expiration of a period timer does not require current
> time. The period should be constant, so it is more precise to add the
> period to the last expiration time. This fixes a time difference
> between hrtimer expiration and apic->lapic_timer.target_expiration.
>
> Signed-off-by: Radim Krčmář <[email protected]>
> ---
> arch/x86/kvm/lapic.c | 38 +++++++++++++-------------------------
> 1 file changed, 13 insertions(+), 25 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index f4734fe12803..6244988418be 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1350,19 +1350,19 @@ static void start_sw_tscdeadline(struct kvm_lapic *apic)
>
> static void start_sw_period(struct kvm_lapic *apic)
> {
> - ktime_t now;
> -
> - now = apic->lapic_timer.timer.base->get_time();
> -
> if (!apic->lapic_timer.period)
> return;
>
> - if (likely(ktime_after(apic->lapic_timer.target_expiration, now)))
> - hrtimer_start(&apic->lapic_timer.timer,
> - apic->lapic_timer.target_expiration,
> - HRTIMER_MODE_ABS_PINNED);
> - else
> + if (apic_lvtt_oneshot(apic) &&
> + ktime_after(apic->lapic_timer.target_expiration,
> + apic->lapic_timer.timer.base->get_time())) {
> apic_timer_expired(apic);
> + return;
> + }
> +
> + hrtimer_start(&apic->lapic_timer.timer,
> + apic->lapic_timer.target_expiration,
> + HRTIMER_MODE_ABS_PINNED);
> }
>
> static bool set_target_expiration(struct kvm_lapic *apic)
> @@ -1414,10 +1414,11 @@ static bool set_target_expiration(struct kvm_lapic *apic)
>
> static void advance_periodic_target_expiration(struct kvm_lapic *apic)
> {
> - apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, rdtsc()) +
> + apic->lapic_timer.tscdeadline +=
> nsec_to_cycles(apic->vcpu, apic->lapic_timer.period);
> - apic->lapic_timer.target_expiration = ktime_add_ns(apic->lapic_timer.timer.base->get_time(),
> - apic->lapic_timer.period);
> + apic->lapic_timer.target_expiration =
> + ktime_add_ns(apic->lapic_timer.target_expiration,
> + apic->lapic_timer.period);
> }
>
> bool kvm_lapic_hv_timer_in_use(struct kvm_vcpu *vcpu)
> @@ -1481,19 +1482,6 @@ void kvm_lapic_switch_to_hv_timer(struct kvm_vcpu *vcpu)
>
> WARN_ON(apic->lapic_timer.hv_timer_in_use);
>
> - if (apic_lvtt_period(apic)) {
> - ktime_t remaining, now;
> - u64 tscl = rdtsc();
> -
> - now = apic->lapic_timer.timer.base->get_time();
> - remaining = hrtimer_get_remaining(&apic->lapic_timer.timer);
> - if (ktime_to_ns(remaining) < 0)
> - remaining = ktime_set(0, 0);
> -
> - apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
> - nsec_to_cycles(apic->vcpu, ktime_to_ns(remaining));
> - apic->lapic_timer.target_expiration = ktime_add_ns(now, ktime_to_ns(remaining));
> - }
> start_hv_timer(apic);
> }
> EXPORT_SYMBOL_GPL(kvm_lapic_switch_to_hv_timer);
>

Reviewed-by: Paolo Bonzini <[email protected]>

Go ahead, squash it into 5/5 and commit to kvm/queue. :)

Paolo

2016-10-24 15:27:46

by Radim Krčmář

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-24 17:09+0200, Paolo Bonzini:
> On 24/10/2016 17:03, Radim Krčmář wrote:
[...]
>
> Reviewed-by: Paolo Bonzini <[email protected]>
>
> Go ahead, squash it into 5/5 and commit to kvm/queue. :)

Did that, thanks.

Wanpeng, the code is now under your name so please check it and/or
complain.

2016-10-24 23:33:06

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH v3 5/5] KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support

2016-10-24 22:50 GMT+08:00 Radim Krčmář <[email protected]>:
> 2016-10-24 18:23+0800, Wanpeng Li:
>> From: Wanpeng Li <[email protected]>
>>
>> Most windows guests still utilize APIC Timer periodic/oneshot mode
>> instead of tsc-deadline mode, and the APIC Timer periodic/oneshot
>> mode are still emulated by high overhead hrtimer on host. This patch
>> converts the expected expire time of the periodic/oneshot mode to
>> guest deadline tsc in order to leverage VMX preemption timer logic
>> for APIC Timer tsc-deadline mode. After each preemption timer vmexit
>> preemption timer is restarted to emulate LVTT current-count register
>> is automatically reloaded from the initial-count register when the
>> count reaches 0. This patch reduces ~5600 cycles for each APIC Timer
>> periodic mode operation virtualization.
>>
>> Cc: Paolo Bonzini <[email protected]>
>> Cc: Radim Krčmář <[email protected]>
>> Cc: Yunhong Jiang <[email protected]>
>> Signed-off-by: Wanpeng Li <[email protected]>
>> ---
>> arch/x86/kvm/lapic.c | 117 ++++++++++++++++++++++++++++++++++++++-------------
>> arch/x86/kvm/lapic.h | 1 +
>> 2 files changed, 89 insertions(+), 29 deletions(-)
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> @@ -2005,8 +2060,12 @@ void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu)
>>
>> if (atomic_read(&apic->lapic_timer.pending) > 0) {
>> kvm_apic_local_deliver(apic, APIC_LVTT);
>> - if (apic_lvtt_tscdeadline(apic))
>> + if (apic_lvtt_period(apic))
>
> This should remain apic_lvtt_tscdeadline(). I can change that when
> applying to kvm/queue.

Thanks Radim, I make a mistake here.

Regards,
Wanpeng Li

>
>> + apic->lapic_timer.tscdeadline = 0;
>> + if (apic_lvtt_oneshot(apic)) {
>> apic->lapic_timer.tscdeadline = 0;
>> + apic->lapic_timer.target_expiration = ktime_set(0, 0);
>> + }
>> atomic_set(&apic->lapic_timer.pending, 0);
>> }
>> }

2016-10-24 23:39:18

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-24 23:27 GMT+08:00 Radim Krčmář <[email protected]>:
> 2016-10-24 17:09+0200, Paolo Bonzini:
>> On 24/10/2016 17:03, Radim Krčmář wrote:
> [...]
>>
>> Reviewed-by: Paolo Bonzini <[email protected]>
>>
>> Go ahead, squash it into 5/5 and commit to kvm/queue. :)
>
> Did that, thanks.
>
> Wanpeng, the code is now under your name so please check it and/or
> complain.

This patch 6/5 incurred regressions.

- The latency of the periodic mode which is emulated by VMX preemption
is almost the same as periodic mode which is emulated by hrtimer.
- The oneshot mode test of kvm-unit-tests/apic_timer_latency.flat almost fail.

Btw, hope you can also apply the testcase for kvm-unit-tests. :)

Regards,
Wanpeng Li

2016-10-25 11:43:41

by Radim Krčmář

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-25 07:39+0800, Wanpeng Li:
> 2016-10-24 23:27 GMT+08:00 Radim Krčmář <[email protected]>:
>> 2016-10-24 17:09+0200, Paolo Bonzini:
>>> On 24/10/2016 17:03, Radim Krčmář wrote:
>> [...]
>>>
>>> Reviewed-by: Paolo Bonzini <[email protected]>
>>>
>>> Go ahead, squash it into 5/5 and commit to kvm/queue. :)
>>
>> Did that, thanks.
>>
>> Wanpeng, the code is now under your name so please check it and/or
>> complain.
>
> This patch 6/5 incurred regressions.
>
> - The latency of the periodic mode which is emulated by VMX preemption
> is almost the same as periodic mode which is emulated by hrtimer.

Hm, what numbers are you getting?

When I ran the test with the original series, then it actually had worse
results with the VMX preemption than it did with the hrimer:

hlt average latency = 1464151
pause average latency = 1467605

htl tests the hrtimer, pause tests the VMX preemption. I just replaced
"hlt" with "pause" in the assembly loop.

The worse result was because the VMX preemption period was computed
incorrectly -- it was being added to now(). Some time passes between
the expiration and reading of now(), so this time was extending the
period while it shouldn't have.

If I run the test with [6/5], it gets sane numbers:

hlt average latency = 1465107
pause average latency = 1465093

The numbers are sane bacause the test is not computing latency (= how
long after the timer should have fired have we received the interrupt)
-- it is computing the duration of the period in cycles, which is much
better right now.

> - The oneshot mode test of kvm-unit-tests/apic_timer_latency.flat almost fail.

Oops, silly mistake -- apic_timer_expired() was in the 'else' branch in
[5/5] and I didn't invert the condition after moving it.

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 6244988418be..d7e74c8ec8ca 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1354,8 +1354,8 @@ static void start_sw_period(struct kvm_lapic *apic)
return;

if (apic_lvtt_oneshot(apic) &&
- ktime_after(apic->lapic_timer.target_expiration,
- apic->lapic_timer.timer.base->get_time())) {
+ !ktime_after(apic->lapic_timer.target_expiration,
+ apic->lapic_timer.timer.base->get_time())) {
apic_timer_expired(apic);
return;
}

Paolo, can you squash that?

> Btw, hope you can also apply the testcase for kvm-unit-tests. :)

I will have some comments, because it would be nicer if it measured the
latency ... expected_expiration is not computed correctly.

2016-10-25 11:56:05

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers



On 25/10/2016 13:43, Radim Krčmář wrote:
> Oops, silly mistake -- apic_timer_expired() was in the 'else' branch in
> [5/5] and I didn't invert the condition after moving it.
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 6244988418be..d7e74c8ec8ca 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1354,8 +1354,8 @@ static void start_sw_period(struct kvm_lapic *apic)
> return;
>
> if (apic_lvtt_oneshot(apic) &&
> - ktime_after(apic->lapic_timer.target_expiration,
> - apic->lapic_timer.timer.base->get_time())) {
> + !ktime_after(apic->lapic_timer.target_expiration,
> + apic->lapic_timer.timer.base->get_time())) {
> apic_timer_expired(apic);
> return;
> }
>
> Paolo, can you squash that?

Yes, will do.

Paolo

2016-10-25 13:04:02

by Radim Krčmář

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-25 12:48+0000, Wanpeng Li:
>> 在 2016年10月24日,下午11:03,Radim Krčmář <[email protected]> 写道:
>>
>> I have only compile-tested it, but it should optimize the switch and
>> also fix two bugs. The first one is major.
>> This needs that the deadline clearing in [5/5] is fixed.
>> ---8<---
>> We must start the hrimer even if the expiration is already in the past,
>> otherwise the periodic timer would not rearm the hrtimer.
>>
>> And computing next expiration of a period timer does not require current
>> time. The period should be constant, so it is more precise to add the
>> period to the last expiration time. This fixes a time difference
>> between hrtimer expiration and apic->lapic_timer.target_expiration.
>>
>> Signed-off-by: Radim Krčmář <[email protected]>
>> ---
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> @@ -1481,19 +1482,6 @@ void kvm_lapic_switch_to_hv_timer(struct kvm_vcpu *vcpu)
>>
>> WARN_ON(apic->lapic_timer.hv_timer_in_use);
>>
>> - if (apic_lvtt_period(apic)) {
>> - ktime_t remaining, now;
>> - u64 tscl = rdtsc();
>> -
>> - now = apic->lapic_timer.timer.base->get_time();
>> - remaining = hrtimer_get_remaining(&apic->lapic_timer.timer);
>> - if (ktime_to_ns(remaining) < 0)
>> - remaining = ktime_set(0, 0);
>> -
>> - apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
>> - nsec_to_cycles(apic->vcpu, ktime_to_ns(remaining));
>> - apic->lapic_timer.target_expiration = ktime_add_ns(now, ktime_to_ns(remaining));
>> - }
>
> Why you remove this?

We computed those values in advance_periodic_target_expiration() and can
re-use them here.

hrtimer expiration == apic->lapic_timer.target_expiration. (Otherwise
we're doing something wrong.) If that holds then the code does

target_expiration = now + (target_expiration - now)

Which can be optimized to

target_expiration = target_expiration

and to nothing. The same principle holds for
apic->lapic_timer.tscdeadline as well.

In other words: It doesn't matter if the timer switches between hrtimer
and VMX deadline -- the target expiration is still the same.

This hunk only added imprecision, because kvm_read_l1_tsc() and
ktime_to_ns() were not using the same time for computation.

2016-10-26 06:02:43

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-25 19:43 GMT+08:00 Radim Krčmář <[email protected]>:
> 2016-10-25 07:39+0800, Wanpeng Li:
>> 2016-10-24 23:27 GMT+08:00 Radim Krčmář <[email protected]>:
>>> 2016-10-24 17:09+0200, Paolo Bonzini:
>>>> On 24/10/2016 17:03, Radim Krčmář wrote:
>>> [...]
>>>>
>>>> Reviewed-by: Paolo Bonzini <[email protected]>
>>>>
>>>> Go ahead, squash it into 5/5 and commit to kvm/queue. :)
>>>
>>> Did that, thanks.
>>>
>>> Wanpeng, the code is now under your name so please check it and/or
>>> complain.
>>
>> This patch 6/5 incurred regressions.
>>
>> - The latency of the periodic mode which is emulated by VMX preemption
>> is almost the same as periodic mode which is emulated by hrtimer.
>
> Hm, what numbers are you getting?

The two fixes look good to me. However, the codes which you remove in
kvm_lapic_switch_to_hv_timer() results in different numbers.

w/o remove hlt average latency = 2398462
w/ remove hlt average latency = 2403845

>
> When I ran the test with the original series, then it actually had worse

Did you test this by running my kvm-unit-tests/apic_timer_latency.flat?

> results with the VMX preemption than it did with the hrimer:
>
> hlt average latency = 1464151
> pause average latency = 1467605
>
> htl tests the hrtimer, pause tests the VMX preemption. I just replaced
> "hlt" with "pause" in the assembly loop.
>
> The worse result was because the VMX preemption period was computed
> incorrectly -- it was being added to now(). Some time passes between
> the expiration and reading of now(), so this time was extending the
> period while it shouldn't have.
>
> If I run the test with [6/5], it gets sane numbers:
>
> hlt average latency = 1465107
> pause average latency = 1465093
>
> The numbers are sane bacause the test is not computing latency (= how
> long after the timer should have fired have we received the interrupt)
> -- it is computing the duration of the period in cycles, which is much
> better right now.

Agreed.

>
>> - The oneshot mode test of kvm-unit-tests/apic_timer_latency.flat almost fail.
>
> Oops, silly mistake -- apic_timer_expired() was in the 'else' branch in
> [5/5] and I didn't invert the condition after moving it.
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 6244988418be..d7e74c8ec8ca 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1354,8 +1354,8 @@ static void start_sw_period(struct kvm_lapic *apic)
> return;
>
> if (apic_lvtt_oneshot(apic) &&
> - ktime_after(apic->lapic_timer.target_expiration,
> - apic->lapic_timer.timer.base->get_time())) {
> + !ktime_after(apic->lapic_timer.target_expiration,
> + apic->lapic_timer.timer.base->get_time())) {
> apic_timer_expired(apic);
> return;
> }
>

It works.

> Paolo, can you squash that?
>
>> Btw, hope you can also apply the testcase for kvm-unit-tests. :)
>
> I will have some comments, because it would be nicer if it measured the
> latency ... expected_expiration is not computed correctly.

It measured the latency from guest programs the clock event device to
interrupt injected to guest after timer fire.

Regards,
Wanpeng Li

2016-10-26 06:08:22

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-26 14:02 GMT+08:00 Wanpeng Li <[email protected]>:
> 2016-10-25 19:43 GMT+08:00 Radim Krčmář <[email protected]>:
>> 2016-10-25 07:39+0800, Wanpeng Li:
>>> 2016-10-24 23:27 GMT+08:00 Radim Krčmář <[email protected]>:
>>>> 2016-10-24 17:09+0200, Paolo Bonzini:
>>>>> On 24/10/2016 17:03, Radim Krčmář wrote:
>>>> [...]
>>>>>
>>>>> Reviewed-by: Paolo Bonzini <[email protected]>
>>>>>
>>>>> Go ahead, squash it into 5/5 and commit to kvm/queue. :)
>>>>
>>>> Did that, thanks.
>>>>
>>>> Wanpeng, the code is now under your name so please check it and/or
>>>> complain.
>>>
>>> This patch 6/5 incurred regressions.
>>>
>>> - The latency of the periodic mode which is emulated by VMX preemption
>>> is almost the same as periodic mode which is emulated by hrtimer.
>>
>> Hm, what numbers are you getting?
>
> The two fixes look good to me. However, the codes which you remove in
> kvm_lapic_switch_to_hv_timer() results in different numbers.
>
> w/o remove hlt average latency = 2398462
> w/ remove hlt average latency = 2403845
>
>>
>> When I ran the test with the original series, then it actually had worse
>
> Did you test this by running my kvm-unit-tests/apic_timer_latency.flat?
>
>> results with the VMX preemption than it did with the hrimer:
>>
>> hlt average latency = 1464151
>> pause average latency = 1467605
>>
>> htl tests the hrtimer, pause tests the VMX preemption. I just replaced
>> "hlt" with "pause" in the assembly loop.
>>
>> The worse result was because the VMX preemption period was computed
>> incorrectly -- it was being added to now(). Some time passes between
>> the expiration and reading of now(), so this time was extending the
>> period while it shouldn't have.
>>
>> If I run the test with [6/5], it gets sane numbers:
>>
>> hlt average latency = 1465107
>> pause average latency = 1465093
>>
>> The numbers are sane bacause the test is not computing latency (= how
>> long after the timer should have fired have we received the interrupt)
>> -- it is computing the duration of the period in cycles, which is much
>> better right now.
>
> Agreed.
>
>>
>>> - The oneshot mode test of kvm-unit-tests/apic_timer_latency.flat almost fail.
>>
>> Oops, silly mistake -- apic_timer_expired() was in the 'else' branch in
>> [5/5] and I didn't invert the condition after moving it.
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 6244988418be..d7e74c8ec8ca 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1354,8 +1354,8 @@ static void start_sw_period(struct kvm_lapic *apic)
>> return;
>>
>> if (apic_lvtt_oneshot(apic) &&
>> - ktime_after(apic->lapic_timer.target_expiration,
>> - apic->lapic_timer.timer.base->get_time())) {
>> + !ktime_after(apic->lapic_timer.target_expiration,
>> + apic->lapic_timer.timer.base->get_time())) {
>> apic_timer_expired(apic);
>> return;
>> }
>>
>
> It works.
>
>> Paolo, can you squash that?
>>
>>> Btw, hope you can also apply the testcase for kvm-unit-tests. :)
>>
>> I will have some comments, because it would be nicer if it measured the
>> latency ... expected_expiration is not computed correctly.
>
> It measured the latency from guest programs the clock event device to
> interrupt injected to guest after timer fire.

When compare this with clock event device which is emulated by
hrtimer, we can calculate the latency bonus from VMX preemption.

Regards,
Wanpeng Li

2016-10-26 10:24:09

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-25 19:43 GMT+08:00 Radim Krčmář <[email protected]>:
> 2016-10-25 07:39+0800, Wanpeng Li:
>> 2016-10-24 23:27 GMT+08:00 Radim Krčmář <[email protected]>:
>>> 2016-10-24 17:09+0200, Paolo Bonzini:
>>>> On 24/10/2016 17:03, Radim Krčmář wrote:
>>> [...]
>>>>
>>>> Reviewed-by: Paolo Bonzini <[email protected]>
>>>>
>>>> Go ahead, squash it into 5/5 and commit to kvm/queue. :)
>>>
>>> Did that, thanks.
>>>
>>> Wanpeng, the code is now under your name so please check it and/or
>>> complain.
>>
>> This patch 6/5 incurred regressions.
>>
>> - The latency of the periodic mode which is emulated by VMX preemption
>> is almost the same as periodic mode which is emulated by hrtimer.
>
> Hm, what numbers are you getting?
>
> When I ran the test with the original series, then it actually had worse
> results with the VMX preemption than it did with the hrimer:
>
> hlt average latency = 1464151
> pause average latency = 1467605
>
> htl tests the hrtimer, pause tests the VMX preemption. I just replaced
> "hlt" with "pause" in the assembly loop.
>
> The worse result was because the VMX preemption period was computed
> incorrectly -- it was being added to now(). Some time passes between
> the expiration and reading of now(), so this time was extending the
> period while it shouldn't have.
>
> If I run the test with [6/5], it gets sane numbers:
>
> hlt average latency = 1465107
> pause average latency = 1465093
>
> The numbers are sane bacause the test is not computing latency (= how
> long after the timer should have fired have we received the interrupt)
> -- it is computing the duration of the period in cycles, which is much
> better right now.
>
>> - The oneshot mode test of kvm-unit-tests/apic_timer_latency.flat almost fail.
>
> Oops, silly mistake -- apic_timer_expired() was in the 'else' branch in
> [5/5] and I didn't invert the condition after moving it.
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 6244988418be..d7e74c8ec8ca 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1354,8 +1354,8 @@ static void start_sw_period(struct kvm_lapic *apic)
> return;
>
> if (apic_lvtt_oneshot(apic) &&
> - ktime_after(apic->lapic_timer.target_expiration,
> - apic->lapic_timer.timer.base->get_time())) {
> + !ktime_after(apic->lapic_timer.target_expiration,
> + apic->lapic_timer.timer.base->get_time())) {
> apic_timer_expired(apic);
> return;
> }
>
> Paolo, can you squash that?

It seems that squash is impossible since the dependency of current
kvm/queue(KVM: x86: use ktime_get instead of seeking the
hrtimer_clock_base), I will send out a separate patch to fix this.

Regards,
Wanpeng Li

2016-10-26 11:18:03

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers



On 26/10/2016 12:23, Wanpeng Li wrote:
>> >
>> > if (apic_lvtt_oneshot(apic) &&
>> > - ktime_after(apic->lapic_timer.target_expiration,
>> > - apic->lapic_timer.timer.base->get_time())) {
>> > + !ktime_after(apic->lapic_timer.target_expiration,
>> > + apic->lapic_timer.timer.base->get_time())) {
>> > apic_timer_expired(apic);
>> > return;
>> > }
>> >
>> > Paolo, can you squash that?
> It seems that squash is impossible since the dependency of current
> kvm/queue(KVM: x86: use ktime_get instead of seeking the
> hrtimer_clock_base), I will send out a separate patch to fix this.

It is squashed in:

+ if (apic_lvtt_oneshot(apic) &&
+ ktime_after(apic->lapic_timer.timer.base->get_time(),
+ apic->lapic_timer.target_expiration)) {
+ apic_timer_expired(apic);
+ return;
+ }

Notice that the order of the arguments is inverted (alternatively I
could have used ktime_before).

Paolo

2016-10-26 11:26:24

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-26 19:15 GMT+08:00 Paolo Bonzini <[email protected]>:
>
>
> On 26/10/2016 12:23, Wanpeng Li wrote:
>>> >
>>> > if (apic_lvtt_oneshot(apic) &&
>>> > - ktime_after(apic->lapic_timer.target_expiration,
>>> > - apic->lapic_timer.timer.base->get_time())) {
>>> > + !ktime_after(apic->lapic_timer.target_expiration,
>>> > + apic->lapic_timer.timer.base->get_time())) {
>>> > apic_timer_expired(apic);
>>> > return;
>>> > }
>>> >
>>> > Paolo, can you squash that?
>> It seems that squash is impossible since the dependency of current
>> kvm/queue(KVM: x86: use ktime_get instead of seeking the
>> hrtimer_clock_base), I will send out a separate patch to fix this.
>
> It is squashed in:
>
> + if (apic_lvtt_oneshot(apic) &&
> + ktime_after(apic->lapic_timer.timer.base->get_time(),
> + apic->lapic_timer.target_expiration)) {
> + apic_timer_expired(apic);
> + return;
> + }
>
> Notice that the order of the arguments is inverted (alternatively I
> could have used ktime_before).

Indeed.

Regards,
Wanpeng Li

2016-10-26 13:32:41

by Radim Krčmář

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-26 14:02+0800, Wanpeng Li:
> 2016-10-25 19:43 GMT+08:00 Radim Krčmář <[email protected]>:
>> 2016-10-25 07:39+0800, Wanpeng Li:
>>> 2016-10-24 23:27 GMT+08:00 Radim Krčmář <[email protected]>:
>>>> 2016-10-24 17:09+0200, Paolo Bonzini:
>>>>> On 24/10/2016 17:03, Radim Krčmář wrote:
>>>> [...]
>>>>>
>>>>> Reviewed-by: Paolo Bonzini <[email protected]>
>>>>>
>>>>> Go ahead, squash it into 5/5 and commit to kvm/queue. :)
>>>>
>>>> Did that, thanks.
>>>>
>>>> Wanpeng, the code is now under your name so please check it and/or
>>>> complain.
>>>
>>> This patch 6/5 incurred regressions.
>>>
>>> - The latency of the periodic mode which is emulated by VMX preemption
>>> is almost the same as periodic mode which is emulated by hrtimer.
>>
>> Hm, what numbers are you getting?
>
> The two fixes look good to me. However, the codes which you remove in
> kvm_lapic_switch_to_hv_timer() results in different numbers.

Which of those two results is closer to the expected duration of the
period?

> w/o remove hlt average latency = 2398462
> w/ remove hlt average latency = 2403845

Some increase is expected when removing the code, because
kvm_lapic_switch_to_hv_timer() decreased the period by mistake:
it called

now = get_time()

first and then did

remaining = target - get_time() // = hrtimer_get_remaining()

but some time has passed in between calls of get_time(), let's call the
time that passed in between as "delta", so when the function later set
the new target,

new_target = now + remaining // = now + target - (now + delta)

the new_target was "delta" earlier.

5k cycles is a huge difference, though ...
You tested the original kvm_lapic_switch_to_hv_timer(), with fixed
advance_periodic_target_expiration()?

>> When I ran the test with the original series, then it actually had worse
>
> Did you test this by running my kvm-unit-tests/apic_timer_latency.flat?

Yes, I used numbers from Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz,
which had TSC calibrated to 2397.223 MHz, so the expected "average
latency" with with the default 0x100000 ns period was

0x100000 * 2.397223 - 0x100000 = 1465094.5044479999

The expected value is pretty close to what I actually measured:

>> [...]
>> If I run the test with [6/5], it gets sane numbers:
>>
>> hlt average latency = 1465107
>> pause average latency = 1465093

2016-10-26 14:01:19

by Radim Krčmář

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-26 14:08+0800, Wanpeng Li:
> 2016-10-26 14:02 GMT+08:00 Wanpeng Li <[email protected]>:
>> 2016-10-25 19:43 GMT+08:00 Radim Krčmář <[email protected]>:
>>> I will have some comments, because it would be nicer if it measured the
>>> latency ... expected_expiration is not computed correctly.
>>
>> It measured the latency from guest programs the clock event device to
>> interrupt injected to guest after timer fire.

No. It never computed the time when the timer fires, the test measured
the duration of the period.

Imagine that the dashed line below is a timeline. Pipe is idealized
firing of the periodic timer and caret is the time when the guest read
time in the interrupt. The number below caret is the latency.

The period is 7.

--------------------------------------------
| | | | | | |
^ ^ ^ ^ ^ ^ ^
1 2 3 1 2 1 1

The test would report "latencies" as:

1 1 1 -2 1 -1 0

because it used now() + period to compute the next expected expiration

Similarly in this case,
--------------------------------------------
| | | | | | |
^ ^ ^ ^ ^ ^
6 6 6 6 6 6

The latency is always 6, but the test would report

6 0 0 0 0 0

And if we improved the latency by 1, you'd only see the difference in
the first number. The test measured the duration of the period.

> When compare this with clock event device which is emulated by
> hrtimer, we can calculate the latency bonus from VMX preemption.

If we know when the timer should have fired, then we can measure the
latency:

latency = now() - expected_expiration

The hard part is computing expected_expiration() -- it *cannot* be
precise with one-shot or periodic APIC timer, because we don't send
expected_expiration to KVM, but only a delta and KVM sets
expected_expiration based on the delta and some random time when it gets
to set the expected_expiration.

The guest could do

before = now()
set_apic_timer(delta)
after = now()

to get some bounds on the expected expiration -- it would be between
"before + delta" and "after + delta".

2016-10-27 02:11:33

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-26 21:32 GMT+08:00 Radim Krčmář <[email protected]>:
> 2016-10-26 14:02+0800, Wanpeng Li:
>> 2016-10-25 19:43 GMT+08:00 Radim Krčmář <[email protected]>:
>>> 2016-10-25 07:39+0800, Wanpeng Li:
>>>> 2016-10-24 23:27 GMT+08:00 Radim Krčmář <[email protected]>:
>>>>> 2016-10-24 17:09+0200, Paolo Bonzini:
>>>>>> On 24/10/2016 17:03, Radim Krčmář wrote:
>>>>> [...]
>>>>>>
>>>>>> Reviewed-by: Paolo Bonzini <[email protected]>
>>>>>>
>>>>>> Go ahead, squash it into 5/5 and commit to kvm/queue. :)
>>>>>
>>>>> Did that, thanks.
>>>>>
>>>>> Wanpeng, the code is now under your name so please check it and/or
>>>>> complain.
>>>>
>>>> This patch 6/5 incurred regressions.
>>>>
>>>> - The latency of the periodic mode which is emulated by VMX preemption
>>>> is almost the same as periodic mode which is emulated by hrtimer.
>>>
>>> Hm, what numbers are you getting?
>>
>> The two fixes look good to me. However, the codes which you remove in
>> kvm_lapic_switch_to_hv_timer() results in different numbers.
>
> Which of those two results is closer to the expected duration of the
> period?

The result of w/ remove is more closer to the expected duration.

>
>> w/o remove hlt average latency = 2398462
>> w/ remove hlt average latency = 2403845
>
> Some increase is expected when removing the code, because
> kvm_lapic_switch_to_hv_timer() decreased the period by mistake:
> it called
>
> now = get_time()
>
> first and then did
>
> remaining = target - get_time() // = hrtimer_get_remaining()
>
> but some time has passed in between calls of get_time(), let's call the
> time that passed in between as "delta", so when the function later set
> the new target,
>
> new_target = now + remaining // = now + target - (now + delta)
>
> the new_target was "delta" earlier.

Agreed.

>
> 5k cycles is a huge difference, though ...

Yeah, delta can't be as large as 5k cycles.

> You tested the original kvm_lapic_switch_to_hv_timer(), with fixed
> advance_periodic_target_expiration()?

Yes.

>
>>> When I ran the test with the original series, then it actually had worse
>>
>> Did you test this by running my kvm-unit-tests/apic_timer_latency.flat?
>
> Yes, I used numbers from Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz,
> which had TSC calibrated to 2397.223 MHz, so the expected "average
> latency" with with the default 0x100000 ns period was
>
> 0x100000 * 2.397223 - 0x100000 = 1465094.5044479999

I agree with your remove the logic in kvm_lapic_switch_to_hv_timer()
since it is more closer to the expected "average latency" now.

Regards,
Wanpeng Li

2016-10-27 02:33:55

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH 6/5] KVM: x86: fix periodic lapic timer with hrtimers

2016-10-26 22:01 GMT+08:00 Radim Krčmář <[email protected]>:
> 2016-10-26 14:08+0800, Wanpeng Li:
>> 2016-10-26 14:02 GMT+08:00 Wanpeng Li <[email protected]>:
>>> 2016-10-25 19:43 GMT+08:00 Radim Krčmář <[email protected]>:
>>>> I will have some comments, because it would be nicer if it measured the
>>>> latency ... expected_expiration is not computed correctly.
>>>
>>> It measured the latency from guest programs the clock event device to
>>> interrupt injected to guest after timer fire.
>
> No. It never computed the time when the timer fires, the test measured
> the duration of the period.
>
> Imagine that the dashed line below is a timeline. Pipe is idealized
> firing of the periodic timer and caret is the time when the guest read
> time in the interrupt. The number below caret is the latency.
>
> The period is 7.
>
> --------------------------------------------
> | | | | | | |
> ^ ^ ^ ^ ^ ^ ^
> 1 2 3 1 2 1 1
>
> The test would report "latencies" as:
>
> 1 1 1 -2 1 -1 0
>
> because it used now() + period to compute the next expected expiration
>
> Similarly in this case,
> --------------------------------------------
> | | | | | | |
> ^ ^ ^ ^ ^ ^
> 6 6 6 6 6 6
>
> The latency is always 6, but the test would report
>
> 6 0 0 0 0 0
>
> And if we improved the latency by 1, you'd only see the difference in
> the first number. The test measured the duration of the period.

Agreed, thanks for the details. :)

Regards,
Wanpeng Li