Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751654AbdG1FPP (ORCPT ); Fri, 28 Jul 2017 01:15:15 -0400 Received: from szxga02-in.huawei.com ([45.249.212.188]:10792 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750935AbdG1FPN (ORCPT ); Fri, 28 Jul 2017 01:15:13 -0400 Message-ID: <597AC847.6050109@huawei.com> Date: Fri, 28 Jul 2017 13:14:47 +0800 From: "Longpeng (Mike)" User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0) Gecko/20120327 Thunderbird/11.0.1 MIME-Version: 1.0 To: Paolo Bonzini CC: , , Huangweidong , Gonglei , wangxin , =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= , Subject: Re: [PATCH 3/4] KVM: VMX: simplify and fix vmx_vcpu_pi_load References: <20170606105707.23207-1-pbonzini@redhat.com> <20170606105707.23207-4-pbonzini@redhat.com> <597ABBEF.2080802@huawei.com> In-Reply-To: <597ABBEF.2080802@huawei.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-Originating-IP: [10.177.246.209] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020205.597AC856.013A,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 3f674909845bc36ba08414e2fc00edd9 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5332 Lines: 166 On 2017/7/28 12:22, Longpeng (Mike) wrote: > > > On 2017/6/6 18:57, Paolo Bonzini wrote: > >> The simplify part: do not touch pi_desc.nv, we can set it when the >> VCPU is first created. Likewise, pi_desc.sn is only handled by >> vmx_vcpu_pi_load, do not touch it in __pi_post_block. >> >> The fix part: do not check kvm_arch_has_assigned_device, instead >> check the SN bit to figure out whether vmx_vcpu_pi_put ran before. >> This matches what the previous patch did in pi_post_block. >> >> Cc: Longpeng (Mike) >> Cc: Huangweidong >> Cc: Gonglei >> Cc: wangxin >> Cc: Radim Krčmář >> Signed-off-by: Paolo Bonzini >> --- >> arch/x86/kvm/vmx.c | 68 ++++++++++++++++++++++++++++-------------------------- >> 1 file changed, 35 insertions(+), 33 deletions(-) >> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >> index 0f4714fe4908..81047f373747 100644 >> --- a/arch/x86/kvm/vmx.c >> +++ b/arch/x86/kvm/vmx.c >> @@ -2184,43 +2184,41 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) >> struct pi_desc old, new; >> unsigned int dest; >> >> - if (!kvm_arch_has_assigned_device(vcpu->kvm) || >> - !irq_remapping_cap(IRQ_POSTING_CAP) || >> - !kvm_vcpu_apicv_active(vcpu)) >> + /* >> + * In case of hot-plug or hot-unplug, we may have to undo >> + * vmx_vcpu_pi_put even if there is no assigned device. And we >> + * always keep PI.NDST up to date for simplicity: it makes the >> + * code easier, and CPU migration is not a fast path. >> + */ >> + if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu) >> + return; > > > Hi Paolo, > > I'm confused with the following scenario: > > (suppose the VM has a assigned devices) > step 1. the running vcpu is be preempted > --> vmx_vcpu_pi_put [ SET pi.sn ] > step 2. hot-unplug the assigned devices > step 3. the vcpu is scheduled in > --> vmx_vcpu_pi_load [ CLEAR pi.sn ] > step 4. the running vcpu is be preempted again > --> vmx_vcpu_pi_put [ direct return ] > step 5. the vcpu is migrated to another pcpu > step 6. the vcpu is scheduled in > --> vmx_vcpu_pi_load [ above check fails and > continue to execute the follow parts ] > > I think vmx_vcpu_pi_load should return direct in step6, because > vmx_vcpu_pi_put in step4 did nothing. > So maybe the above check has a potential problem. Oh! Sorry, please just ignore the above. I made a mistaken. > > Please kindly figure out if I misunderstand anything important :) > > -- > Regards, > Longpeng(Mike) > >> + >> + /* >> + * First handle the simple case where no cmpxchg is necessary; just >> + * allow posting non-urgent interrupts. >> + * >> + * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change >> + * PI.NDST: pi_post_block will do it for us and the wakeup_handler >> + * expects the VCPU to be on the blocked_vcpu_list that matches >> + * PI.NDST. >> + */ >> + if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || >> + vcpu->cpu == cpu) { >> + pi_clear_sn(pi_desc); >> return; >> + } >> >> + /* The full case. */ >> do { >> old.control = new.control = pi_desc->control; >> >> - /* >> - * If 'nv' field is POSTED_INTR_WAKEUP_VECTOR, there >> - * are two possible cases: >> - * 1. After running 'pre_block', context switch >> - * happened. For this case, 'sn' was set in >> - * vmx_vcpu_put(), so we need to clear it here. >> - * 2. After running 'pre_block', we were blocked, >> - * and woken up by some other guy. For this case, >> - * we don't need to do anything, 'pi_post_block' >> - * will do everything for us. However, we cannot >> - * check whether it is case #1 or case #2 here >> - * (maybe, not needed), so we also clear sn here, >> - * I think it is not a big deal. >> - */ >> - if (pi_desc->nv != POSTED_INTR_WAKEUP_VECTOR) { >> - if (vcpu->cpu != cpu) { >> - dest = cpu_physical_id(cpu); >> - >> - if (x2apic_enabled()) >> - new.ndst = dest; >> - else >> - new.ndst = (dest << 8) & 0xFF00; >> - } >> + dest = cpu_physical_id(cpu); >> >> - /* set 'NV' to 'notification vector' */ >> - new.nv = POSTED_INTR_VECTOR; >> - } >> + if (x2apic_enabled()) >> + new.ndst = dest; >> + else >> + new.ndst = (dest << 8) & 0xFF00; >> >> - /* Allow posting non-urgent interrupts */ >> new.sn = 0; >> } while (cmpxchg(&pi_desc->control, old.control, >> new.control) != old.control); >> @@ -9259,6 +9257,13 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) >> >> vmx->msr_ia32_feature_control_valid_bits = FEATURE_CONTROL_LOCKED; >> >> + /* >> + * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR >> + * or POSTED_INTR_WAKEUP_VECTOR. >> + */ >> + vmx->pi_desc.nv = POSTED_INTR_VECTOR; >> + vmx->pi_desc.sn = 1; >> + >> return &vmx->vcpu; >> >> free_vmcs: >> @@ -11249,9 +11254,6 @@ static void __pi_post_block(struct kvm_vcpu *vcpu) >> else >> new.ndst = (dest << 8) & 0xFF00; >> >> - /* Allow posting non-urgent interrupts */ >> - new.sn = 0; >> - >> /* set 'NV' to 'notification vector' */ >> new.nv = POSTED_INTR_VECTOR; >> } while (cmpxchg(&pi_desc->control, old.control, > > -- Regards, Longpeng(Mike)