Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754154AbcJMLju (ORCPT ); Thu, 13 Oct 2016 07:39:50 -0400 Received: from mail-lf0-f65.google.com ([209.85.215.65]:34763 "EHLO mail-lf0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755004AbcJMLib (ORCPT ); Thu, 13 Oct 2016 07:38:31 -0400 MIME-Version: 1.0 In-Reply-To: <20161012174109.GF30525@potion> References: <1476255172-3357-1-git-send-email-wanpeng.li@hotmail.com> <1476255172-3357-3-git-send-email-wanpeng.li@hotmail.com> <20161012174109.GF30525@potion> From: Wanpeng Li Date: Thu, 13 Oct 2016 19:38:12 +0800 Message-ID: Subject: Re: [PATCH RFC V2 2/2] KVM: x86: Support using the VMX preemption timer for APIC Timer periodic/oneshot mode To: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= Cc: "linux-kernel@vger.kernel.org" , kvm , Paolo Bonzini , Yunhong Jiang , Wanpeng Li Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by mail.home.local id u9DBdtYg017899 Content-Length: 5945 Lines: 155 2016-10-13 1:41 GMT+08:00 Radim Krčmář : > 2016-10-12 14:52+0800, Wanpeng Li: >> From: Wanpeng Li >> >> Most windows guests still utilize APIC Timer periodic/oneshot mode >> instead of tsc-deadline mode, and the APIC Timer periodic/oneshot >> mode are still emulated by high overhead hrtimer on host. This patch >> converts the expected expire time of the periodic/oneshot mode to >> guest deadline tsc in order to leverage VMX preemption timer logic >> for APIC Timer tsc-deadline mode. >> >> Cc: Paolo Bonzini >> Cc: Radim Krčmář >> Cc: Yunhong Jiang >> Signed-off-by: Wanpeng Li >> --- >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c >> @@ -1101,7 +1101,14 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic) >> apic->lapic_timer.period == 0) >> return 0; >> >> - remaining = hrtimer_get_remaining(&apic->lapic_timer.timer); >> + if (kvm_lapic_hv_timer_in_use(apic->vcpu)) { > > apic->lapic_timer.expired_period is set unconditionally, so the > condition seems superfluous. Agreed. > >> + ktime_t now; >> + >> + now = apic->lapic_timer.timer.base->get_time(); >> + remaining = ktime_sub(apic->lapic_timer.expired_period, now); > > The name expired_period is weird for something that can be in the future > ... maybe target_ns? How about target_expiration? It should be ktime_t as the second parameter of hrtimer_start(). > >> + } else >> + remaining = hrtimer_get_remaining(&apic->lapic_timer.timer); >> + >> if (ktime_to_ns(remaining) < 0) >> remaining = ktime_set(0, 0); >> >> @@ -1353,8 +1360,6 @@ static void start_sw_period(struct kvm_lapic *apic) >> >> /* lapic timer in oneshot or periodic mode */ >> now = apic->lapic_timer.timer.base->get_time(); >> - apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT) >> - * APIC_BUS_CYCLE_NS * apic->divide_count; > > I would gut start_sw_period() a bit more: the checking of minimal period > is a bit misplaced, because it uses now() when the timer switched to > hrtimers, but that could have been long after the timer was set, so we'd > delay for no good reason. Agreed. > >> if (!apic->lapic_timer.period) >> return; > > Can this happen? (WARN_ONCE() if it is only a sanity check.) Actually I catch several apic->lapic_timer.period == 0 here, in addition, the oneshot mode test of kvm-unit-tests/apic.flat will fail if change it to WARN_ONCE() instead of return directly. > >> @@ -1400,52 +1405,71 @@ bool kvm_lapic_hv_timer_in_use(struct kvm_vcpu *vcpu) >> +void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu) >> +{ >> + struct kvm_lapic *apic = vcpu->arch.apic; >> + >> + WARN_ON(!apic->lapic_timer.hv_timer_in_use); >> + WARN_ON(swait_active(&vcpu->wq)); >> + cancel_hv_timer(apic); >> + apic_timer_expired(apic); >> + >> + if (apic_lvtt_period(apic)) { >> + ktime_t now; >> + u64 tscl = rdtsc(); >> + >> + apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT) >> + * APIC_BUS_CYCLE_NS * apic->divide_count; >> + >> + if (!apic->lapic_timer.period) >> + return; >> + >> + apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) + >> + nsec_to_cycles(apic->vcpu, apic->lapic_timer.period); >> + now = apic->lapic_timer.timer.base->get_time(); >> + apic->lapic_timer.expired_period = ktime_add_ns(now, apic->lapic_timer.period); >> + >> + start_hv_timer(apic); >> + } >> +} >> +EXPORT_SYMBOL_GPL(kvm_lapic_expired_hv_timer); >> + >> @@ -1470,10 +1497,25 @@ static void start_apic_timer(struct kvm_lapic *apic) >> { >> atomic_set(&apic->lapic_timer.pending, 0); >> >> - if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic)) >> - start_sw_period(apic); >> - else if (apic_lvtt_tscdeadline(apic)) { >> - if (!(kvm_x86_ops->set_hv_timer && start_hv_tscdeadline(apic))) >> + if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic)) { >> + ktime_t now; >> + u64 tscl = rdtsc(); >> + >> + apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT) >> + * APIC_BUS_CYCLE_NS * apic->divide_count; >> + >> + if (!apic->lapic_timer.period) >> + return; >> + apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) + >> + nsec_to_cycles(apic->vcpu, apic->lapic_timer.period); >> + now = apic->lapic_timer.timer.base->get_time(); >> + apic->lapic_timer.expired_period = ktime_add_ns(now, apic->lapic_timer.period); > > This is the same code as in kvm_lapic_expired_hv_timer(), a helper > function is in order. Agreed. > >> + >> + if (!(kvm_x86_ops->set_hv_timer && start_hv_timer(apic))) >> + start_sw_period(apic); >> + } else if (apic_lvtt_tscdeadline(apic)) { >> + if (!(kvm_x86_ops->set_hv_timer && start_hv_timer(apic))) >> start_sw_tscdeadline(apic); >> } >> } >> @@ -1711,8 +1753,7 @@ u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu) >> { >> struct kvm_lapic *apic = vcpu->arch.apic; >> >> - if (!lapic_in_kernel(vcpu) || apic_lvtt_oneshot(apic) || >> - apic_lvtt_period(apic)) > > rdmsr MSR_IA32_TSCDEADLINE has to return 0 when the timer is not in tsc > deadline mode, so the condition should stay. (and check for > apic_lvtt_tscdeadline(), but that is a unrelated cleanup.) Agreed. I just sent out RFC V3 to handle these comments, thanks for your active review, Radim. It is really helpful. :) Regards, Wanpeng Li